Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
From: Marek Vasut
Date: Sun May 03 2026 - 19:54:25 EST
On 4/28/26 9:00 AM, Koichiro Den wrote:
Hello Den-san,
The patch makes sense to me. Let me ask two questions:
1. Could r8a779f0 (R-Car S4-8) be handled as well, perhaps by adding a separate
.additional_common_init() implementation for it?
As far as I can see, the r8a779f0 match data currently does not use
rcar_gen4_pcie_additional_common_init().
I will address this one in V2, thank you for pointing that out.
2. Did you also happen to test V4H/V4M in endpoint (EP) mode, with the local
eDMA engine issuing MRd requests toward host memory?
I was not able to test this configuration.
Is it possible to perform this test with a single device, by having the eDMA do local-memory-read-to-local-memory-write transfers, maybe using the PIPE_LOOPBACK/LOOPBACK_ENABLE bits, or do I need two devices with an NTB connection between them?
In case it is the latter, could you please briefly describe the S4 NTB setup you use, so I could try to replicate it locally?
Your commit message
describes an NVMe device as the requester, but I'm wondering whether the same
256B limit was also verified for the R-Car EP DMA requester path.
This part I currently cannot answer, I'm sorry.
...
I made the following two observations in the meantime.
First, I wrote 4 GiB of random data to two SSDs, a Crucial P5 Plus SSD without HMPRE (without host memory buffer) and an XPG GAMMIX P55 with HMPRE (with host memory buffer), on another system (iMX8M Plus, ARM64, also with a DWC PCIe controller). I then read the data back and compared it; the written and read-back data matched.
Then I plugged both SSDs into V4H Sparrow Hawk _without_ this patch, and read the data back:
- Crucial P5 Plus SSD without HMPRE (without host memory buffer)
-> Data read back matches data written on iMX8M Plus, OK
- XPG GAMMIX P55 with HMPRE (with host memory buffer)
-> Data read back matches data written on iMX8M Plus, OK
Then I wrote 512 Bytes of data into the Crucial P5 Plus SSD without HMPRE on V4H Sparrow Hawk and read it back again.
-> Data read back does NOT match data written, NG
That would indicate that:
- WRITE transfers from SSD to DRAM are OK
- READ transfers from DRAM to SSD are corrupted at the 256 Byte boundary
That would indicate that we need _at_least_ the 256 Bytes limit, likely on both MPS and MRRS.
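For anyone who wants to repeat this kind of check, here is a minimal sketch of a read-back comparison that reports which 256 Byte blocks differ. It assumes the written and read-back data are already available as byte strings; the function name is hypothetical and this is not necessarily the tool used for the test above.

```python
def mismatching_chunks(written: bytes, read_back: bytes, chunk: int = 256):
    """Return start offsets of chunk-sized blocks which differ."""
    if len(written) != len(read_back):
        raise ValueError("buffers differ in length")
    return [
        off
        for off in range(0, len(written), chunk)
        if written[off:off + chunk] != read_back[off:off + chunk]
    ]
```

If the returned offsets are always multiples of 256 and the first block is always intact, that would be consistent with corruption starting at the 256 Byte boundary.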
Second, I got a report of another SSD for which this patch is not sufficient. I currently do not have access to that SSD, but I will ask for access and investigate. That may shed some light on the 128 Byte limit below.
(*) The background for my question 2:
I only have access to S4 Spider boards. In my RC <-> EP setup, where the EP
side uses the local eDMA engine to issue MRd requests toward the RC, 256-byte
MRd requests still appear to corrupt the transferred data.
Is the corruption deterministic in some way, i.e. are the same bytes of the transferred data corrupted every time, or is the corruption "random"?
Does the corruption happen even on a single MRd transfer, or does it happen only when a lot of traffic is sent across the NTB link? I wonder if this corruption might be DRAM bandwidth related, i.e. whether the DMA might saturate the DRAM controller with write requests and make the system run out of DRAM bandwidth.
With the following
change on top of your patch, my DMA-read tests become stable:
[...]
One detail which might be important is that limiting only MPS does not appear
to be sufficient in my setup. MPS=128B with MRRS=256B still seems broken,
while MPS=128B with MRRS=128B works fine. I wonder whether this is because
the "MPS" term in the min(MRRS, MPS) limit for DMA read transfers may
effectively be tied to the DMA read buffer segment size / MPSS rather than
only to DevCtl.MPS. I'm not sure about this yet though.
I think setting MPS=128B MRRS=256B only leads to the transfer being split into 2 x 128B TLPs sent across the PCIe link, but in the end, 2 x 128 Bytes of data are received (in some order) into the read segment buffer and reordered, and 1 x 256 Bytes are written from read segment buffer into the memory as a single write.
In case of MPS=256B MRRS=256B, only one 256B TLP is sent across the link, 1 x 256 Bytes of data are received into the read segment buffer with no reordering necessary, and 1 x 256 Bytes are still written from read segment buffer into the memory as a single write.
=> For MPS=128B/MPS=256B and MRRS=256B, there is a difference in the
transfer format between PCIe and DMA, but there is no difference
between DMA and DRAM.
But in case of MRRS=128B and a transfer of 256 Bytes, 2 x 128 Bytes of data are received into (separate? (*)) entries in the read segment buffer, and 2 x 128 Bytes are written from (separate?) entries in the read segment buffer into the memory as two separate writes. Could this different memory write pattern be responsible for the (lack of) corruption?
Do you know whether the data are corrupted on the PCIe-to-DMA side (when the data are received from the PCIe side and written into the read buffer segment) or on the DMA-to-DRAM side (on read from the read segment buffer or on write into DRAM)?
(*) Since the read segment buffer has 16 x 256 Byte segments, with 16 DMA tags and never more than 16 MRd requests in flight, I think it is likely that the data for each MRd land in a separate read segment buffer segment. But this information comes from another datasheet, not the V4H one.
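The three MPS/MRRS cases discussed above can be sketched as a small model. This is only an illustration of the hypothesis in this mail, not confirmed hardware behavior: it assumes aligned transfers, the same MPS on both ends of the link, no further splitting of completions at the Read Completion Boundary, and one memory write per MRd request. All names are hypothetical.

```python
def split(length: int, limit: int):
    """Split length bytes into pieces of at most limit bytes."""
    return [min(limit, length - off) for off in range(0, length, limit)]


def read_transfer_shape(length: int, mps: int, mrrs: int):
    """Model how a DMA read of length bytes is split up."""
    # Each MRd request asks for at most MRRS bytes.
    requests = split(length, mrrs)
    # Each request is answered by completion TLPs of at most MPS bytes.
    completions = [split(req, mps) for req in requests]
    return {
        "mrd_requests": len(requests),
        "completion_tlps": sum(len(c) for c in completions),
        # Assumption from this mail: one write to DRAM per MRd request.
        "dram_writes": len(requests),
    }
```

With a 256 Byte transfer, MPS=128B/MRRS=256B and MPS=256B/MRRS=256B differ only in the number of completion TLPs, while MPS=128B/MRRS=128B is the only case with two separate writes to memory.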
One more thing I noticed in the manuals:
R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025:
Type00 MPSS initial = 256B, PCI R, Internal R/W
Type01 MPSS initial = 128B, PCI R, Internal R
R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025
Type00 MPSS initial = 256B, PCI R, Internal R
Type01 MPSS initial = 128B, PCI R, Internal R/W
I'm still unsure, but this difference might be relevant. In particular, in
V4H/V4M RC mode your patch programs DevCtl.MPS to 256B, but does not change
Type01 MPSS. I wonder if the Type01 MPSS should also be updated to 256B first
on SoCs where the manual says it is writable from the internal bus, or if I'm
missing something here.
This is a very good point.
The R-Car S4 RM Rev.1.20 lists Type00 MPSS as Internal R and Type01 MPSS as Internal R/W. This was updated in RM Rev.1.30 to Type00 Internal R/W and Type01 Internal R. It is possible this change is going to be added into the V4H RM in the future too. That would likely imply that Type01 MPSS is not programmable.
I don't think Type01 affects RC operation, but does it affect NTB?
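For reference, MPSS (Device Capabilities bits 2:0), MPS (Device Control bits 7:5) and MRRS (Device Control bits 14:12) all use the same 128 << n encoding. A minimal sketch decoding raw register values follows; the mask names mirror those in the Linux pci_regs.h header, while the function names are made up for illustration.

```python
PCI_EXP_DEVCAP_PAYLOAD = 0x0007  # MPSS, Device Capabilities bits 2:0
PCI_EXP_DEVCTL_PAYLOAD = 0x00E0  # MPS, Device Control bits 7:5
PCI_EXP_DEVCTL_READRQ = 0x7000   # MRRS, Device Control bits 14:12


def decode_mpss(devcap: int) -> int:
    """Return Max_Payload_Size Supported in bytes."""
    return 128 << (devcap & PCI_EXP_DEVCAP_PAYLOAD)


def decode_devctl(devctl: int):
    """Return (MPS, MRRS) in bytes from a Device Control value."""
    mps = 128 << ((devctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5)
    mrrs = 128 << ((devctl & PCI_EXP_DEVCTL_READRQ) >> 12)
    return mps, mrrs
```

A DevCtl.MPS of 256B only makes sense if the MPSS reported by the same function is at least 256B, which is why the initial Type01 MPSS of 128B matters here.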
[...]
Thank you for your help!
--
Best regards,
Marek Vasut