[PATCH RFC v2] arm64: dts: lx2160a: extend 32-bit, and add 64-bit pci regions

From: Josua Mayer
Date: Mon Apr 29 2024 - 14:24:56 EST


The LX2160 SoC pci-e controllers support 64-bit memory regions up to 16GB,
32-bit regions up to 3GB and 16-bit regions up to 64k.

For each pci-e controller:
- extend the existing 32-bit regions to 3GB size
- add 16-bit region
- add 64-bit region
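
As a cross-check of the sizes listed above, the following is a minimal sketch
that decodes the pcie1 ranges entries from this patch. It assumes the usual
7-cell layout (3-cell pci address, 2-cell cpu address, 2-cell size); the
helper name decode_range is made up for illustration and is not part of any
kernel or dtc tooling.

```python
GIB = 1 << 30
KIB = 1 << 10


def decode_range(cells):
    """Decode one 7-cell pci ranges entry into (flags, pci_addr, cpu_addr, size).

    Assumed layout: flags, pci addr hi/lo, cpu addr hi/lo, size hi/lo.
    """
    flags, pci_hi, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo = cells
    pci_addr = (pci_hi << 32) | pci_lo
    cpu_addr = (cpu_hi << 32) | cpu_lo
    size = (size_hi << 32) | size_lo
    return flags, pci_addr, cpu_addr, size


# pcie1 entries copied from the diff in this patch
PCIE1_RANGES = [
    [0x02102000, 0x87, 0x00000000, 0x87, 0x00000000, 0x01, 0x00000000],
    [0x02102000, 0x86, 0x00000000, 0x86, 0x00000000, 0x01, 0x00000000],
    [0x02102000, 0x85, 0x00000000, 0x85, 0x00000000, 0x01, 0x00000000],
    [0x02102000, 0x84, 0x00000000, 0x84, 0x00000000, 0x01, 0x00000000],
    [0x02000200, 0x00, 0x40000000, 0x80, 0x40000000, 0x00, 0xc0000000],
    [0x01200100, 0x00, 0x00000000, 0x80, 0x10000000, 0x00, 0x00010000],
]

decoded = [decode_range(r) for r in PCIE1_RANGES]

# Total size of the four 64-bit chunks, and sizes of the other two windows
size_64bit = sum(size for _, _, _, size in decoded[:4])
size_32bit = decoded[4][3]
size_io = decoded[5][3]

for flags, pci_addr, cpu_addr, size in decoded:
    print(f"flags={flags:#010x} pci={pci_addr:#x} cpu={cpu_addr:#x} size={size:#x}")
```

Under that assumption the four 64-bit chunks add up to 16GB, the 32-bit
window is 3GB and the IO window is 64k, matching the limits stated above.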

The same memory allocation with similar flags has been tested with UEFI
and ACPI on pcie3 and pcie5.
This specific device-tree configuration was tested with nxp lsdk-21.08
based u-boot:
- pcie5 with a Radeon Pro WX2100 with Gnome Desktop
- pcie3 with an ADATA NVME

Fixes allocation of large and 64-bit BARs as requested by many pci
cards, especially graphics processors or AI accelerators, e.g.:
[ 2.941187] pci 0000:01:00.0: BAR 0: no space for [mem size 0x200000000 64bit pref]
[ 2.948834] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x200000000 64bit pref]

This version is still marked RFC because it carries a workaround for
a limitation of the designware pcie controller driver.
The ATU has a maximum allocation size of 4GB. Larger allocations should
be implemented as multiple allocations of 4GB in the driver,
similar to how UEFI implemented it for ACPI.
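
The splitting described above can be illustrated with a short sketch. This is
not the designware driver code and not the UEFI implementation; it only shows
the arithmetic of cutting one large window into pieces of at most 4GB. The
names ATU_MAX_SIZE and split_window are hypothetical.

```python
ATU_MAX_SIZE = 1 << 32  # 4GB per ATU allocation, as stated in this patch


def split_window(base, size, chunk=ATU_MAX_SIZE):
    """Return (base, size) pairs covering [base, base + size).

    Each returned piece is at most `chunk` bytes long.
    """
    pieces = []
    offset = 0
    while offset < size:
        length = min(chunk, size - offset)
        pieces.append((base + offset, length))
        offset += length
    return pieces


# The 16GB prefetchable window of pcie1 (commented-out ranges line in the diff)
pieces = split_window(0x84_0000_0000, 16 << 30)
for piece_base, piece_size in pieces:
    print(f"base={piece_base:#x} size={piece_size:#x}")
```

For the pcie1 window this yields four pieces starting at 0x84, 0x85, 0x86 and
0x87 in the upper address cell, which are the same four chunks that the
workaround in this patch lists explicitly in the ranges property.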

Signed-off-by: Josua Mayer <josua@xxxxxxxxxxxxx>
---
Changes in v2:
- adjusted flags to fix several errors during probe and bar allocation
- explicitly tested with 2 pci cards on Debian (Linux 6.1)
- still RFC because of a limitation in the designware pci driver
- Link to v1: https://lore.kernel.org/r/20240321-lx2160-pci-v1-1-3673708f7eb6@xxxxxxxxxxxxx
---
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi | 55 +++++++++++++++++++++++---
1 file changed, 49 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
index 6640b49670ae..ec4e6252f83b 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
@@ -1,3 +1,4 @@
+
// SPDX-License-Identifier: (GPL-2.0 OR MIT)
//
// Device Tree Include file for Layerscape-LX2160A family SoC.
@@ -1134,7 +1135,14 @@ pcie1: pcie@3400000 {
apio-wins = <8>;
ppio-wins = <8>;
bus-range = <0x0 0xff>;
- ranges = <0x82000000 0x0 0x40000000 0x80 0x40000000 0x0 0x40000000>; /* non-prefetchable memory */
+ // ranges = <0x02102000 0x84 0x00000000 0x84 0x00000000 0x04 0x00000000>, /* 64-Bit - prefetchable */
+ /* split 64-bit area into 4GB chunks as workaround for ATU max allocation size */
+ ranges = <0x02102000 0x87 0x00000000 0x87 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x86 0x00000000 0x86 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x85 0x00000000 0x85 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x84 0x00000000 0x84 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02000200 0x00 0x40000000 0x80 0x40000000 0x00 0xc0000000>, /* 32-Bit - non-prefetchable */
+ <0x01200100 0x00 0x00000000 0x80 0x10000000 0x00 0x00010000>; /* 16-Bit IO Window */
msi-parent = <&its>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;
@@ -1162,7 +1170,14 @@ pcie2: pcie@3500000 {
apio-wins = <8>;
ppio-wins = <8>;
bus-range = <0x0 0xff>;
- ranges = <0x82000000 0x0 0x40000000 0x88 0x40000000 0x0 0x40000000>; /* non-prefetchable memory */
+ // ranges = <0x02102000 0x8c 0x00000000 0x8c 0x00000000 0x04 0x00000000>, /* 64-Bit - prefetchable */
+ /* split 64-bit area into 4GB chunks as workaround for ATU max allocation size */
+ ranges = <0x02102000 0x8f 0x00000000 0x8f 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x8e 0x00000000 0x8e 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x8d 0x00000000 0x8d 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x8c 0x00000000 0x8c 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02000200 0x00 0x40000000 0x88 0x40000000 0x00 0xc0000000>, /* 32-Bit - non-prefetchable */
+ <0x01200100 0x00 0x00000000 0x88 0x10000000 0x00 0x00010000>; /* 16-Bit IO Window */
msi-parent = <&its>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;
@@ -1190,7 +1205,14 @@ pcie3: pcie@3600000 {
apio-wins = <256>;
ppio-wins = <24>;
bus-range = <0x0 0xff>;
- ranges = <0x82000000 0x0 0x40000000 0x90 0x40000000 0x0 0x40000000>; /* non-prefetchable memory */
+ // ranges = <0x02102000 0x94 0x00000000 0x94 0x00000000 0x04 0x00000000>, /* 64-Bit - prefetchable */
+ /* split 64-bit area into 4GB chunks as workaround for ATU max allocation size */
+ ranges = <0x02102000 0x97 0x00000000 0x97 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x96 0x00000000 0x96 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x95 0x00000000 0x95 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x94 0x00000000 0x94 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02000200 0x00 0x40000000 0x90 0x40000000 0x00 0xc0000000>, /* 32-Bit - non-prefetchable */
+ <0x01200100 0x00 0x00000000 0x90 0x10000000 0x00 0x00010000>; /* 16-Bit IO Window */
msi-parent = <&its>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;
@@ -1218,7 +1240,14 @@ pcie4: pcie@3700000 {
apio-wins = <8>;
ppio-wins = <8>;
bus-range = <0x0 0xff>;
- ranges = <0x82000000 0x0 0x40000000 0x98 0x40000000 0x0 0x40000000>; /* non-prefetchable memory */
+ // ranges = <0x02102000 0x9c 0x00000000 0x9c 0x00000000 0x04 0x00000000>, /* 64-Bit - prefetchable */
+ /* split 64-bit area into 4GB chunks as workaround for ATU max allocation size */
+ ranges = <0x02102000 0x9f 0x00000000 0x9f 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x9e 0x00000000 0x9e 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x9d 0x00000000 0x9d 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0x9c 0x00000000 0x9c 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02000200 0x00 0x40000000 0x98 0x40000000 0x00 0xc0000000>, /* 32-Bit - non-prefetchable */
+ <0x01200100 0x00 0x00000000 0x98 0x10000000 0x00 0x00010000>; /* 16-Bit IO Window */
msi-parent = <&its>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;
@@ -1246,7 +1275,14 @@ pcie5: pcie@3800000 {
apio-wins = <256>;
ppio-wins = <24>;
bus-range = <0x0 0xff>;
- ranges = <0x82000000 0x0 0x40000000 0xa0 0x40000000 0x0 0x40000000>; /* non-prefetchable memory */
+ // ranges = <0x02102000 0xa4 0x00000000 0xa4 0x00000000 0x04 0x00000000>, /* 64-Bit - prefetchable */
+ /* split 64-bit area into 4GB chunks as workaround for ATU max allocation size */
+ ranges = <0x02102000 0xa7 0x00000000 0xa7 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0xa6 0x00000000 0xa6 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0xa5 0x00000000 0xa5 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0xa4 0x00000000 0xa4 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02000200 0x00 0x40000000 0xa0 0x40000000 0x00 0xc0000000>, /* 32-Bit - non-prefetchable */
+ <0x01200100 0x00 0x00000000 0xa0 0x10000000 0x00 0x00010000>; /* 16-Bit IO Window */
msi-parent = <&its>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;
@@ -1274,7 +1310,14 @@ pcie6: pcie@3900000 {
apio-wins = <8>;
ppio-wins = <8>;
bus-range = <0x0 0xff>;
- ranges = <0x82000000 0x0 0x40000000 0xa8 0x40000000 0x0 0x40000000>; /* non-prefetchable memory */
+ // ranges = <0x02102000 0xac 0x00000000 0xac 0x00000000 0x04 0x00000000>, /* 64-Bit - prefetchable */
+ /* split 64-bit area into 4GB chunks as workaround for ATU max allocation size */
+ ranges = <0x02102000 0xaf 0x00000000 0xaf 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0xae 0x00000000 0xae 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0xad 0x00000000 0xad 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02102000 0xac 0x00000000 0xac 0x00000000 0x01 0x00000000>, /* 64-Bit - prefetchable - 4GB chunk */
+ <0x02000200 0x00 0x40000000 0xa8 0x40000000 0x00 0xc0000000>, /* 32-Bit - non-prefetchable */
+ <0x01200100 0x00 0x00000000 0xa8 0x10000000 0x00 0x00010000>; /* 16-Bit IO Window */
msi-parent = <&its>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 7>;

---
base-commit: e8f897f4afef0031fe618a8e94127a0934896aba
change-id: 20240118-lx2160-pci-4bdb196e58f3

Sincerely,
--
Josua Mayer <josua@xxxxxxxxxxxxx>