Re: [PATCH 2/3] pci: Clamp pcie_set_readrq() when using "performance" settings

From: Benjamin LaHaise
Date: Tue Oct 04 2011 - 10:42:17 EST

On Mon, Oct 03, 2011 at 04:55:48PM -0500, Jon Mason wrote:
> From: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> When configuring the PCIe settings for "performance", we allow parents
> to have a larger Max Payload Size than children and rely on children
> Max Read Request Size to not be larger than their own MPS to avoid
> having the host bridge generate responses they can't cope with.

I'm pretty sure that simply will not work, and that it reflects an incorrect
understanding of how PCIe bridges and devices interact with regard to
transaction size limits. Here's why: I am actually implementing a PCIe NIC on an FPGA at
present, and have just been in the process of tuning how memory read
requests are issued and processed. It is perfectly valid for a PCIe
endpoint to issue a read request for an entire 4KB block (assuming it
respects the no 4KB boundary crossings rule), even when the MPS setting
is only 64 or 128 bytes. However, the root complex or PCIe bridge *must
not* exceed the Maximum Payload Size for any completions with data or
posted writes. Multiple completions are okay and expected for read
requests. If the MPS on the bridge is set to a larger value than what
all of the endpoints connected to it support, the bridge or root complex
will happily send read completions exceeding the endpoint's MPS. This can
and will lead to failures on the part of the endpoints.
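
To make the failure mode concrete, here is a rough model of the behaviour described above. It is only a sketch: the function names are made up for illustration, the completer is assumed to fill every completion up to its own MPS, and Read Completion Boundary alignment is ignored entirely. The point is that the size of the read request never enters into the completion size; only the completer's MPS does.

```python
PAGE_SIZE = 4096  # read requests must not cross a 4KB boundary


def split_completions(addr, length, completer_mps):
    """Return the payload sizes of the completions for one read request.

    Simplified model (an assumption, not spec-exact): the completer
    returns data in chunks of at most its own MPS and ignores RCB.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if addr // PAGE_SIZE != (addr + length - 1) // PAGE_SIZE:
        raise ValueError("read request crosses a 4KB boundary")

    sizes = []
    remaining = length
    while remaining > 0:
        chunk = min(remaining, completer_mps)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


def oversized_completions(addr, length, completer_mps, endpoint_mps):
    """Return the completion payloads larger than the endpoint's MPS."""
    return [size
            for size in split_completions(addr, length, completer_mps)
            if size > endpoint_mps]
```

With both sides at an MPS of 128, a 4KB read comes back as 32 completions of 128 bytes and oversized_completions(0, 4096, 128, 128) is empty. Raise only the bridge to 256 and the same read comes back as 16 completions of 256 bytes, every one of which is larger than the endpoint can accept, regardless of how the endpoint's Max Read Request Size was set.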
