Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
From: Koichiro Den
Date: Tue Apr 28 2026 - 03:00:37 EST
On Sun, Apr 26, 2026 at 01:38:28AM +0200, Marek Vasut wrote:
> R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
> maximum payload size. The PCIe DMA generates requests of size up
> to minimum(Max_Read_Request_Size, Max_Payload_Size). Force limit
> both Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and
> propagate this limit to all downstream devices.
>
> This limitation can be triggered for example by using an NVMe SSD
> which does not use host memory buffer, Samsung 980 PRO is such an
> SSD. Affected SSD reports 'hmpre' field as 0:
> "
> $ nvme id-ctrl /dev/nvme0 | grep hmpre
> hmpre : 0
> "
>
> The symptom is a read from the SSD which wraps around at 256 Byte
> boundary. The test for this symptom can be implemented by writing
> 512 Byte of random data into the SSD and reading the data back. If
> the read back data repeat after 256 Bytes, the device is affected.
> "
> $ dd if=/dev/urandom of=/tmp/data.bin bs=256 count=2 && \
> dd if=/tmp/data.bin of=/dev/nvme0n1 bs=256 count=2 && \
> dd if=/dev/nvme0n1 bs=256 count=2 of=/tmp/readback.bin
> "
>
> Expected data:
> "
> $ hexdump -vC /tmp/data.bin
> 00000000 97 81 b7 3b 0e 38 2b 4d a7 d3 e0 47 ff c2 4b ca
> 00000010 c1 85 98 f0 4a ac 03 a0 3b ab f3 19 44 dd 06 8b
> ...
> 00000100 7a ce 3c b2 e1 d5 d9 11 88 63 10 59 76 3c dc 32 <-- random
> 00000110 72 32 2a 7d a3 e1 aa 13 7c da 58 a1 7b 21 11 50 <-- data
> "
>
> Faulty readback, collected without this change in place:
> "
> $ hexdump -vC /tmp/readback.bin
> 00000000 97 81 b7 3b 0e 38 2b 4d a7 d3 e0 47 ff c2 4b ca <---.
> 00000010 c1 85 98 f0 4a ac 03 a0 3b ab f3 19 44 dd 06 8b <-. |
> ...                                                        | |
> 00000100 97 81 b7 3b 0e 38 2b 4d a7 d3 e0 47 ff c2 4b ca <-:-+- repeated
> 00000110 c1 85 98 f0 4a ac 03 a0 3b ab f3 19 44 dd 06 8b <-+--- data
>      ^^^
>       |
>       '--- Repeat starts at offset 0x100 = 256 Bytes
> "
>
> Fixes: 0d0c551011df ("PCI: rcar-gen4: Add R-Car Gen4 PCIe controller support for host mode")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Marek Vasut <marek.vasut+renesas@xxxxxxxxxxx>
> ---
> Cc: "Krzysztof Wilczyński" <kwilczynski@xxxxxxxxxx>
> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Cc: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
> Cc: Koichiro Den <den@xxxxxxxxxxxxx>
> Cc: Lorenzo Pieralisi <lpieralisi@xxxxxxxxxx>
> Cc: Magnus Damm <magnus.damm@xxxxxxxxx>
> Cc: Manivannan Sadhasivam <mani@xxxxxxxxxx>
> Cc: Rob Herring <robh@xxxxxxxxxx>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@xxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-pci@xxxxxxxxxxxxxxx
> Cc: linux-renesas-soc@xxxxxxxxxxxxxxx
> ---
> drivers/pci/controller/dwc/pcie-rcar-gen4.c | 56 +++++++++++++++++++++
> 1 file changed, 56 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> index 8b03c42f8c84c..82f0a074a71da 100644
> --- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> @@ -576,6 +576,7 @@ static int r8a779f0_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable)
> static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
> {
> struct dw_pcie *dw = &rcar->dw;
> + u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
> u32 val;
>
> val = dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW);
> @@ -584,11 +585,66 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
> val |= BIT(6);
> dw_pcie_writel_dbi(dw, PCIE_PORT_LANE_SKEW, val);
>
> + val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
> + val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
> + val |= PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B;
> + dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
> +
> val = readl(rcar->base + PCIEPWRMNGCTRL);
> val |= APP_CLK_REQ_N | APP_CLK_PM_EN;
> writel(val, rcar->base + PCIEPWRMNGCTRL);
> }
Hello Marek,

The patch makes sense to me. Let me ask two questions:

1. Could r8a779f0 (R-Car S4-8) be handled as well, perhaps by adding a separate
   .additional_common_init() implementation for it?
   As far as I can see, the r8a779f0 match data currently does not use
   rcar_gen4_pcie_additional_common_init().

2. Did you also happen to test V4H/V4M in endpoint (EP) mode, with the local
   eDMA engine issuing MRd requests toward host memory? Your commit message
   describes an NVMe device as the requester, but I'm wondering whether the same
   256B limit was also verified for the R-Car EP DMA requester path.

(*) The background for my question 2:

I only have access to S4 Spider boards. In my RC <-> EP setup, where the EP
side uses the local eDMA engine to issue MRd requests toward the RC, 256-byte
MRd requests still appear to corrupt the transferred data. With the following
change on top of your patch, my DMA-read tests become stable:

---8<-----8<---
diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
index 82f0a074a71d..6910b9cd9d7b 100644
--- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
+++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
@@ -595,6 +595,18 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
writel(val, rcar->base + PCIEPWRMNGCTRL);
}
+static void r8a779f0_additional_common_init(struct rcar_gen4_pcie *rcar)
+{
+ struct dw_pcie *dw = &rcar->dw;
+ u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
+ u32 val;
+
+ val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
+ val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
+ val |= PCI_EXP_DEVCTL_PAYLOAD_128B | PCI_EXP_DEVCTL_READRQ_128B;
+ dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
+}
+
static void rcar_gen4_rc_pcie_quirk(struct pci_dev *dev)
{
static const struct pci_device_id rcar_gen4_pcie_rc_devid = {
@@ -796,11 +808,13 @@ static int rcar_gen4_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable
}
static struct rcar_gen4_pcie_drvdata drvdata_r8a779f0_pcie = {
+ .additional_common_init = r8a779f0_additional_common_init,
.ltssm_control = r8a779f0_pcie_ltssm_control,
.mode = DW_PCIE_RC_TYPE,
};
static struct rcar_gen4_pcie_drvdata drvdata_r8a779f0_pcie_ep = {
+ .additional_common_init = r8a779f0_additional_common_init,
.ltssm_control = r8a779f0_pcie_ltssm_control,
.mode = DW_PCIE_EP_TYPE,
};
---8<-----8<---
One detail which might be important is that limiting only MPS does not appear
to be sufficient in my setup. MPS=128B with MRRS=256B still seems broken,
while MPS=128B with MRRS=128B works fine. I wonder whether this is because
the "MPS" term in the min(MRRS, MPS) limit for DMA read transfers may
effectively be tied to the DMA read buffer segment size / MPSS rather than
only to DevCtl.MPS. I'm not sure about this yet though.
One more thing I noticed in the manuals:

  R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025:
    Type00 MPSS initial = 256B, PCI R, Internal R/W
    Type01 MPSS initial = 128B, PCI R, Internal R

  R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025:
    Type00 MPSS initial = 256B, PCI R, Internal R
    Type01 MPSS initial = 128B, PCI R, Internal R/W

I'm still unsure, but this difference might be relevant. In particular, in
V4H/V4M RC mode your patch programs DevCtl.MPS to 256B, but does not change
Type01 MPSS. I wonder if the Type01 MPSS should also be updated to 256B first
on SoCs where the manual says it is writable from the internal bus, or if I'm
missing something here.
Best regards,
Koichiro
>
> +static void rcar_gen4_rc_pcie_quirk(struct pci_dev *dev)
> +{
> + static const struct pci_device_id rcar_gen4_pcie_rc_devid = {
> + PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0030),
> + .class = PCI_CLASS_BRIDGE_PCI_NORMAL, .class_mask = ~0
> + };
> + struct pci_bus *bus = dev->bus;
> + struct pci_dev *bridge;
> +
> + if (pci_is_root_bus(bus))
> + bridge = dev;
> +
> + /* Look for the host bridge */
> + while (!pci_is_root_bus(bus)) {
> + bridge = bus->self;
> + bus = bus->parent;
> + }
> +
> + if (!bridge)
> + return;
> +
> + if (!pci_match_one_device(&rcar_gen4_pcie_rc_devid, bridge))
> + return;
> +
> + /*
> + * R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
> + * maximum payload size. The PCIe DMA generates requests of size up
> + * to minimum(Max_Read_Request_Size, Max_Payload_Size). Force limit
> + * both Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and
> + * propagate this limit to all downstream devices.
> + *
> + * For details, refer to:
> + * R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025 or
> + * R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 or
> + * R-Car V4M R19UH0217EJ0100 Rev.1.00 Dec. 12, 2025,
> + * chapters 104.1.1 Features and 104.3.9 DMA Transfer
> + * section DMA Read Transfer.
> + */
> + if (pcie_get_readrq(dev) > 256) {
> + dev_info(&dev->dev, "Limiting MRRS to 256 bytes\n");
> + pcie_set_readrq(dev, 256);
> + }
> +
> + if (pcie_get_mps(dev) > 256) {
> + dev_info(&dev->dev, "Limiting MPS to 256 bytes\n");
> + pcie_set_mps(dev, 256);
> + }
> +}
> +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, rcar_gen4_rc_pcie_quirk);
> +
> static void rcar_gen4_pcie_phy_reg_update_bits(struct rcar_gen4_pcie *rcar,
> u32 offset, u32 mask, u32 val)
> {
> --
> 2.53.0
>