Re: [PATCH v2] arm64: dts: rockchip: rk356x: Fix PCIe register map and ranges

From: Robin Murphy
Date: Tue Oct 25 2022 - 06:30:44 EST


On 2022-10-24 21:16, Peter Geis wrote:
On Mon, Oct 24, 2022 at 7:05 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:

On 2022-10-22 18:24, Mark Kettenis wrote:
From: Peter Geis <pgwipeout@xxxxxxxxx>
Date: Sat, 22 Oct 2022 08:19:57 -0400

Hello Peter,

On Fri, Oct 21, 2022 at 4:52 PM Mark Kettenis <mark.kettenis@xxxxxxxxx> wrote:

Date: Fri, 21 Oct 2022 21:32:48 +0200
From: Ondřej Jirman <megi@xxxxxx>

On Fri, Oct 21, 2022 at 12:48:15PM -0400, Peter Geis wrote:
On Fri, Oct 21, 2022 at 11:39 AM Ondřej Jirman <megi@xxxxxx> wrote:

On Fri, Oct 21, 2022 at 09:07:50AM -0400, Peter Geis wrote:
Good Morning Heiko,

Apologies for just getting to this, I'm still in the middle of moving
and just got my lab set back up.

I've tested this patch series and it leads to the same regression with
NVMe drives. A loop of md5sum on two identical 4GB random files
produces the following results:
d11cf0caa541b72551ca22dc5bef2de0 test-rand.img
fad97e91da8d4fd554c895cafa89809b test-rand2.img
2d56a7baa05c38535f4c19a2b371f90a test-rand.img
74e8e6f93d7c3dc3ad250e91176f5901 test-rand2.img
25cfcfecf4dd529e4e9fbbe2be482053 test-rand.img
74e8e6f93d7c3dc3ad250e91176f5901 test-rand2.img
b9637505bf88ed725f6d03deb7065dab test-rand.img
f7437e88d524ea92e097db51dce1c60d test-rand2.img

Before this patch series:
d11cf0caa541b72551ca22dc5bef2de0 test-rand.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand2.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand2.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand2.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand.img
d11cf0caa541b72551ca22dc5bef2de0 test-rand2.img

Though I do love where this patch is going and would like to see if it
can be made to work, in its current form it does not.

Thanks for the test. Can you please also test v1? Also please share lspci -vvv
of your nvme drive, so that we can see allocated address ranges, etc.

Good catch, with your patch as is, the following issue crops up:
Region 0: Memory at 300000000 (64-bit, non-prefetchable) [size=16K]
Region 2: I/O ports at 1000 [disabled] [size=256]

However, with a simple fix, we can get this:
Region 0: Memory at 300000000 (64-bit, non-prefetchable) [virtual] [size=16K]
Region 2: I/O ports at 1000 [virtual] [size=256]

and with it a working NVMe drive.

Change the following range:
0x02000000 0x0 0x40000000 0x3 0x00000000 0x0 0x40000000>;
to
0x02000000 0x0 0x00000000 0x3 0x00000000 0x0 0x40000000>;

I've already tried this, but this unfrotunately breaks the wifi cards.
(those only use the I/O space) Maybe because I/O and memory address spaces
now overlap, I don't know. That's why I used the 1GiB offset for memory
space.

Meanwhile, I have an NVMe drive that only works if mmio is completely
untranslated. This is an ADATA SX8000NP drive, which uses a Silicon
Motion SM2260 controller.

So for me, a working configuration has the following "ranges":

ranges = <0x01000000 0x0 0x00000000 0x3 0x3fff0000 0x0 0x00010000>,
<0x02000000 0x0 0xf4000000 0x0 0xf4000000 0x0 0x02000000>,
<0x03000000 0x3 0x10000000 0x3 0x10000000 0x0 0x2fff0000>;

This also needs changes to the "reg" propery:

reg = <0x3 0xc0000000 0x0 0x00400000>,
<0x0 0xfe260000 0x0 0x00010000>,
<0x3 0x00000000 0x0 0x10000000>;

Now this is interesting. I've been reading up on PCIe ranges and what
is necessary for things to work properly, and I found this interesting
article from ARM:
https://developer.arm.com/documentation/102337/0000/Programmers-model/Memory-maps/AP-system-memory-map/PCIe-MMIO-and-ECAM-memory-regions

TLDR: We need a low region (below 4g) and a high region.

Well, that description applies to a specific ARM reference design.
And it appears that the PCIe-RC used in that reference design does not
support address translation.

Indeed, that's not an "interesting article", it's just documentation for
some other system that isn't this one. In fact it's a system that
strictly doesn't even *have* PCIe; the reference designs are not
complete SoCs, and all that is being described there is the interconnect
address map for the parts which are in place ready for a customer to
stitch their choice of PCIe implementation to.

The equivalent for RK3568 is that you *do* have "low" and "high" PCIe
windows at 0xfx000000 and 0x3xxx00000 respectively in the system
interconnect address map. How the PCIe controllers choose to relate
those system MMIO addresses to those to PCI Memory, I/O and Config space
addresses is another matter entirely.

Unfortunately we are working with insufficient documentation and
without the detailed understanding of a system integrator here. I'm
fully aware that the Neoverse N2 is not the rk3568, however
significant chunks of the rk3568 are based on ARM IP. Looking at how
ARM expects things to work by comparing their reference documents to
the hardware we have on hand is helpful in determining what we are
lacking.

The specific portions of the documentation that I found useful are not
the memory maps, but the generic descriptions of expected PCIe
regions. Combining those with other reference documents (unfortunately
most x86 based, but we have the unfortunate reality that PCIe has a
lot of x86isms to deal with) is quite enlightening.

OK, but you're looking at the wrong place for that. The only actual relevant reference would be rule PCI_MM_06 in the BSA[1], which says that PCI memory space should not be translated relative to the system address map. It is hopefully obvious that 32-bit devices need 32-bit PCI mem space to assign to their BARs, thus it falls out that if there is no translation, that requires a 32-bit window in system address space too.

That is of course speaking of a BSA-compliant system. Vendors are still free to not care about BSA and do whatever the heck they want.

Thanks,
Robin.

[1] https://developer.arm.com/documentation/den0094/latest/

I've been pinging
various representatives of the IP and implementation on the mailing
list about these issues for about a year now with no responses from
the Designware folk. You have been pretty one of the only individuals
with the level of knowledge we need to respond and I thank you for
that.

Based on what I've read I suspect that at least one of the two
following statements is true:
a. Mark is correct that translation is broken in Rockchip's
implementation (unknown if this is a SoC or a driver issue)
b. We do in fact require IO and Config to be 32 bit addressable to be
fully compatible.

These issues are compounded in rk3588 where we have much smaller
regions in the 32bit space for PCIe, so a definite answer on the true
requirements and limitations would be quite helpful.

As always, thank you for your time,
Peter


Robin.