Re: [PATCH 04/11] of: address: Preserve the flags portion on 1:1 dma-ranges mapping

From: Rob Herring
Date: Thu Aug 29 2024 - 09:18:42 EST


On Thu, Aug 29, 2024 at 5:13 AM Andrea della Porta
<andrea.porta@xxxxxxxx> wrote:
>
> Hi Rob,

BTW, I noticed your email replies set "reply-to" to everyone in To and
Cc. The result (with Gmail) is my reply lists everyone twice (in both
To and Cc). "reply-to" is just supposed to be the one address you want
replies sent to instead of the "from" address.

> On 16:29 Mon 26 Aug , Rob Herring wrote:
> > On Wed, Aug 21, 2024 at 3:19 AM Andrea della Porta
> > <andrea.porta@xxxxxxxx> wrote:
> > >
> > > Hi Rob,
> > >
> > > On 19:16 Tue 20 Aug , Rob Herring wrote:
> > > > On Tue, Aug 20, 2024 at 04:36:06PM +0200, Andrea della Porta wrote:
> > > > > A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
> > > > > translations. In this specific case, rhe current behaviour is to zero out
> > > >
> > > > typo
> > >
> > > Fixed, thanks!
> > >
> > > >
> > > > > the entire specifier so that the translation could be carried on as an
> > > > > offset from zero. This includes address specifier that has flags (e.g.
> > > > > PCI ranges).
> > > > > Once the flags portion has been zeroed, the translation chain is broken
> > > > > since the mapping functions will check the upcoming address specifier
> > > >
> > > > What does "upcoming address" mean?
> > >
> > > Sorry for the confusion, this means "address specifier (with valid flags) fed
> > > to the translating functions and for which we are looking for a translation".
> > > While this address has some valid flags set, it will fail the translation step
> > > since the ranges it is matched against have flags zeroed out by the 1:1 mapping
> > > condition.
> > >
> > > >
> > > > > against mismatching flags, always failing the 1:1 mapping and its entire
> > > > > purpose of always succeeding.
> > > > > Set to zero only the address portion while passing the flags through.
> > > >
> > > > Can you point me to what the failing DT looks like. I'm puzzled how
> > > > things would have worked for anyone.
> > > >
> > >
> > > The following is a (simplified and lightly edited) version of the resulting DT
> > > from RPi5:
> > >
> > > pci@0,0 {
> > >     #address-cells = <0x03>;
> > >     #size-cells = <0x02>;
> > >     ......
> > >     device_type = "pci";
> > >     compatible = "pci14e4,2712\0pciclass,060400\0pciclass,0604";
> > >     ranges = <0x82000000 0x00 0x00 0x82000000 0x00 0x00 0x00 0x600000>;
> > >     reg = <0x00 0x00 0x00 0x00 0x00>;
> > >
> > >     ......
> > >
> > >     rp1@0 {
> >
> > What does 0 represent here? There's no 0 address in 'ranges' below.
> > Since you said the parent is a PCI-PCI bridge, then the unit-address
> > would have to be the PCI devfn and you are missing 'reg' (or omitted
> > it).
>
> There's no reg property because the registers for RP1 are addressed
> starting at offset 0x40108000 from BAR1. The devicetree spec says
> that a node without a reg property should not have a unit address
> (and AFAIK there are no other special directives for simple-bus
> specified in dt-bindings).
> I've added @0 just to get rid of the following warning:
>
> Warning (unit_address_vs_reg): /fragment@0/__overlay__/rp1: node has
> a reg or ranges property, but no unit name

It's still wrong; dtc only checks that the unit-address is correct in
a few cases with known bus types.

> coming from make W=1 CHECK_DTBS=y broadcom/rp1.dtbo.
> This is the exact same approach used by the Bootlin patchset at:
>
> https://lore.kernel.org/all/20240808154658.247873-2-herve.codina@xxxxxxxxxxx/

It is not. First, that has a node for the PCI device (i.e. the
LAN966x). You do not. You only have a PCI-PCI bridge and that is
wrong.

BTW, you should Cc Herve and others that are working on this feature.
It is by no means fully sorted as you have found.

> reproduced below for convenience:
>
> + pci-ep-bus@0 {
> + compatible = "simple-bus";
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + /*
> + * map @0xe2000000 (32MB) to BAR0 (CPU)
> + * map @0xe0000000 (16MB) to BAR1 (AMBA)
> + */
> + ranges = <0xe2000000 0x00 0x00 0x00 0x2000000

The 0 parent address here matches the unit-address, so all good in this case.

> + 0xe0000000 0x01 0x00 0x00 0x1000000>;
>
> Also, I think it's not possible to know the devfn in advance, since the
> DT part is pre-compiled as an overlay while the devfn number is coming from
> bus enumeration.

No. devfn is fixed unless you are plugging in a card in different
slots. The bus number is the part that is not known and assigned by
the OS, but you'll notice that is omitted.

In any case, the RP1 node should be generated, so its devfn is irrelevant.
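
To illustrate the devfn point, here is a minimal sketch (not code from
this patchset). The macros use the same bit packing as the kernel's
PCI_DEVFN/PCI_SLOT/PCI_FUNC; format_pci_unit_name() is a hypothetical
helper that only shows how a "device,function" unit-address such as
pci@0,0 follows from devfn, with no bus number involved.

```c
#include <stdint.h>
#include <stdio.h>

/* Same bit layout as the kernel's PCI_DEVFN, PCI_SLOT and PCI_FUNC:
 * device (slot) number in bits 7:3, function number in bits 2:0. */
#define DEVFN(slot, func)  ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SLOT(devfn)        (((devfn) >> 3) & 0x1f)
#define FUNC(devfn)        ((devfn) & 0x07)

/*
 * Hypothetical helper: build "<name>@<device>,<function>" in hex.
 * Only devfn is needed; the bus number is not part of the unit-address.
 * Returns the value of snprintf().
 */
int format_pci_unit_name(char *buf, size_t len, const char *name,
			 uint8_t devfn)
{
	return snprintf(buf, len, "%s@%x,%x", name,
			SLOT(devfn), FUNC(devfn));
}
```

With devfn 0 (device 0, function 0) the helper produces "pci@0,0",
whatever bus number the OS ends up assigning.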

> Since the registers for sub-peripherals will start (as stated in the
> ranges property) from 0xc040000000, I'd be inclined to use rp1@c040000000
> as the node name and unit address. Is it feasible?

Yes, but that would be in nodes underneath ranges. Above, it is the
parent bus we are talking about.

> > >         #address-cells = <0x02>;
> > >         #size-cells = <0x02>;
> > >         compatible = "simple-bus";
> >
> > The parent is a PCI-PCI bridge. Child nodes have to be PCI devices and
> > "simple-bus" is not a PCI device.
>
> The simple-bus is needed to automatically traverse and create platform
> devices in of_platform_populate(). It's true that RP1 is a PCI device,
> but sub-peripherals of RP1 are platform devices so I guess this is
> unavoidable right now.

You are missing the point. A PCI-PCI bridge does not have a
simple-bus. However, I think it's just what you pasted here that's
wrong. From the looks of the RP1 driver and the overlay, it should be
correct.

It would also help if you dumped out what "lspci -tvnn" prints.

> > The assumption so far with all of this is that you have some specific
> > PCI device (and therefore a driver). The simple-buses under it are
> > defined per BAR. Not really certain if that makes sense in all cases,
> > but since the address assignment is dynamic, it may have to. I'm also
> > not completely convinced whether we should reuse 'simple-bus' here or
> > define something specific like 'pci-bar-bus'.
>
> Good point. Labeling a new bus for this kind of 'appliance' could be
> beneficial to unify the dt overlay approach, and I guess it could be
> adopted by the aforementioned Bootlin's Microchip patchset too.
> However, since the difference with simple-bus would be basically
> nonexistent, I believe that this could be done in a future patch,
> because the dtbo is contained in the driver itself, so we do not
> suffer from the proliferation that happens when dtbs are managed
> outside.

It's an ABI, so we really need to decide first.

> > >         ranges = <0xc0 0x40000000 0x01 0x00 0x00 0x00 0x400000>;
> > >         dma-ranges = <0x10 0x00 0x43000000 0x10 0x00 0x10 0x00>;
> > >         ......
> > >     };
> > > };
> > >
> > > The pci@0,0 bridge node is automatically created by virtue of
> > > CONFIG_PCI_DYNAMIC_OF_NODES, and has no dma-ranges, hence it implies 1:1 dma
> > > mappings (flags for this mapping are set to zero). The rp1@0 node has
> > > dma-ranges with flags set (0x43000000). Since 0x43000000 != 0x00 any translation
> > > will fail.
> >
> > It's possible that we should fill in 'dma-ranges' when making these
> > nodes rather than supporting missing dma-ranges here.
>
> I really think that filling dma-ranges for dynamically created pci
> nodes would be the correct approach.
> However, IMHO this does not imply that we could let inconsistent
> addresses (64 bit addr with 32 flag bit set) lying around the
> translation chain, and fixing that is currently working fine. I'd
> be then inclined to say the proposed change is outside the scope
> of the present patchset and to postpone it to a future patch.

Okay, but let's fix it with a test case. There's already a test case
for all this in the DT unittest which can be extended.

Rob
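
For reference, the failure mode discussed in this thread can be sketched
with a deliberately simplified model. This is not the code in
drivers/of/address.c: struct spec, struct range and all three functions
are hypothetical stand-ins, and the exact flags comparison is an
assumption (the real bus map functions compare selected flag bits). The
sketch only shows why zeroing the flags cell during the implicit 1:1
step makes a later flag-aware match fail, while zeroing only the address
portion lets it succeed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a 3-cell PCI address specifier:
 * one flags cell plus a 64-bit address. */
struct spec {
	uint32_t flags;
	uint64_t addr;
};

/* Simplified stand-in for one range entry to match against. */
struct range {
	uint32_t flags;
	uint64_t base;
	uint64_t size;
};

/* Implicit 1:1 step, behaviour described as current in the thread:
 * zero the whole specifier and carry the address as an offset. */
uint64_t one_to_one_zero_all(struct spec *s)
{
	uint64_t offset = s->addr;

	s->flags = 0;
	s->addr = 0;
	return offset;
}

/* Implicit 1:1 step, behaviour proposed by the patch:
 * zero only the address portion and pass the flags through. */
uint64_t one_to_one_keep_flags(struct spec *s)
{
	uint64_t offset = s->addr;

	s->addr = 0;
	return offset;
}

/* Flag-aware match (assumption: strict equality on the flags cell). */
bool range_matches(const struct spec *s, uint64_t offset,
		   const struct range *r)
{
	uint64_t a = s->addr + offset;

	if (s->flags != r->flags)
		return false;
	return a >= r->base && a - r->base < r->size;
}
```

Using the values from the dma-ranges quoted above (flags 0x43000000,
address 0x10 0x00, size 0x10 0x00), a specifier passed through
one_to_one_zero_all() no longer matches a range carrying those flags,
while one passed through one_to_one_keep_flags() does. A unittest
extension could check the same property against the real translation
code.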