[RESEND] Handle MPS mismatch for Switch Downstream Ports

From: Devilliv Kelly

Date: Tue Mar 31 2026 - 00:12:56 EST


Background
==========
Commit 9f0e89359775 ("PCI: Match Root Port's MPS to endpoint's MPSS as
necessary") added logic to reduce a Root Port's MPS when an endpoint's
MPSS is smaller than the Root Port's current MPS setting. This ensures
hot-added devices can work correctly.

However, this logic only applies to ROOT_PORT type bridges:

    mpss = 128 << dev->pcie_mpss;
    if (mpss < p_mps && pci_pcie_type(bridge) == PCI_EXP_TYPE_ROOT_PORT) {
        pcie_set_mps(bridge, mpss);
        ...
    }

This leaves Switch Downstream Ports unhandled, which can cause issues
when the Switch reports an incorrect or unexpected MPS value after a
secondary bus reset.
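
For reference, the "128 << dev->pcie_mpss" expression decodes the 3-bit
Max_Payload_Size encoding used in the PCIe Device Capabilities and Device
Control registers. Below is a minimal user-space sketch of that arithmetic.
The helper names are hypothetical; only the mask values mirror
PCI_EXP_DEVCAP_PAYLOAD and PCI_EXP_DEVCTL_PAYLOAD from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Device Control: Max_Payload_Size lives in bits 7:5 (PCI_EXP_DEVCTL_PAYLOAD). */
#define DEVCTL_PAYLOAD_MASK  0x00e0u
#define DEVCTL_PAYLOAD_SHIFT 5

/* Device Capabilities: Max_Payload_Size Supported lives in bits 2:0. */
#define DEVCAP_PAYLOAD_MASK  0x0007u

/* Hypothetical helper: convert the 3-bit field value to a size in bytes. */
unsigned int mps_code_to_bytes(unsigned int code)
{
	assert(code <= 5);	/* 0..5 map to 128..4096 bytes; 6 and 7 are reserved */
	return 128u << code;
}

/* Hypothetical helper: current MPS in bytes from a raw Device Control value. */
unsigned int devctl_to_mps(uint16_t devctl)
{
	return mps_code_to_bytes((devctl & DEVCTL_PAYLOAD_MASK) >> DEVCTL_PAYLOAD_SHIFT);
}

/* Hypothetical helper: MPSS in bytes from a raw Device Capabilities value. */
unsigned int devcap_to_mpss(uint32_t devcap)
{
	return mps_code_to_bytes(devcap & DEVCAP_PAYLOAD_MASK);
}
```

With this encoding, a 2048-byte MPS corresponds to field value 4 and a
512-byte MPSS to field value 2.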

Problem Description
===================
We encountered a scenario where a PCIe Switch Downstream Port reports
an MPS value larger than what the endpoint can support:

Topology:
  16:00.0 - Switch Upstream Port (MPS = 512 bytes, correct)
   └── 17:00.0 - Switch Downstream Port (MPS = 2048 bytes after secondary bus reset)
        └── 18:00.0 - Endpoint device (DevCap MaxPayload = 512 bytes)

After a secondary bus reset, the Switch Downstream Port's MPS unexpectedly
became 2048 bytes. When the kernel enumerates the endpoint device (18:00.0),
it attempts to set the endpoint's MPS to 2048 to match the upstream bridge,
but this fails because the endpoint only supports a maximum of 512 bytes.

Kernel log shows:
  pci 0000:18:00.0: can't set Max Payload Size to 2048; if necessary,
  use "pci=pcie_bus_safe" and report a bug

This results in NMI errors when the endpoint attempts DMA transactions:
  Uhhuh. NMI received for unknown reason 2c on CPU 0.
  Dazed and confused, but trying to continue

Root Cause
==========
The pci_configure_mps() function only adjusts the upstream bridge's MPS
when the bridge is a Root Port (PCI_EXP_TYPE_ROOT_PORT). For Switch
Downstream Ports (PCI_EXP_TYPE_DOWNSTREAM), the kernel attempts to set
the endpoint's MPS to the bridge's value without checking whether the
endpoint can support it.

While the Switch firmware should ideally configure correct MPS values,
the kernel should be robust enough to handle such cases and ensure
proper MPS configuration for reliable operation.

Current Behavior
================
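1. The endpoint's MPSS is smaller than the bridge's MPS.
2. The bridge is a Switch Downstream Port, not a Root Port.
3. The kernel therefore skips the bridge MPS adjustment.
4. pcie_set_mps(dev, p_mps) fails because p_mps exceeds the endpoint's MPSS.
5. The bridge and endpoint are left with mismatched MPS values, and the
   device may malfunction (NMIs on DMA in our case).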
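
The sequence above can be reproduced with a small user-space model. This is
a simplified sketch, not the kernel code: the struct, enum, and function
names are hypothetical, the pcie_bus_config checks and early returns of
pci_configure_mps() are left out, and the endpoint's starting MPS of 128
bytes is an assumption.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two port types discussed here. */
enum model_port_type { MODEL_ROOT_PORT, MODEL_DOWNSTREAM_PORT };

struct model_dev {
	enum model_port_type type;	/* only meaningful for bridges */
	unsigned int mps;		/* current Max Payload Size, bytes */
	unsigned int mpss;		/* Max Payload Size Supported, bytes */
};

/* Rejects values above the device's MPSS, as step 4 describes. */
int model_set_mps(struct model_dev *d, unsigned int mps)
{
	if (mps > d->mpss)
		return -1;
	d->mps = mps;
	return 0;
}

/* Simplified model of the decision in pci_configure_mps(). */
int model_configure_mps(struct model_dev *dev, struct model_dev *bridge)
{
	unsigned int p_mps = bridge->mps;

	/* The bridge is only lowered when it is a Root Port. */
	if (dev->mpss < p_mps && bridge->type == MODEL_ROOT_PORT) {
		model_set_mps(bridge, dev->mpss);
		p_mps = bridge->mps;
	}

	/* For a Downstream Port, p_mps may still exceed dev->mpss here. */
	int rc = model_set_mps(dev, p_mps);
	if (rc)
		printf("can't set Max Payload Size to %u\n", p_mps);
	return rc;
}
```

In this model, a Downstream Port at 2048 bytes above an endpoint with a
512-byte MPSS makes model_configure_mps() fail and leaves both values
untouched, while the same call with a Root Port lowers the bridge to 512
and succeeds.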

Workaround
==========
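The issue can be worked around by using the kernel parameter:
  pci=pcie_bus_safe

However, this sets every device's MPS to the largest value supported by
all devices below the root complex, so it affects the entire system and
may reduce performance for other devices.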

Questions for Discussion
========================
1. Was there a specific reason for restricting this logic to ROOT_PORT
only? The commit message mentions avoiding impact on "other unrelated
sub-topologies," but Switch Downstream Ports typically only have one
endpoint below them.

2. Should we also consider propagating MPS changes up through multiple
Switch levels in the hierarchy?
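
To make question 2 concrete, here is one possible reading of "propagating
up", sketched as a user-space model. Everything in it is hypothetical
(struct and function names included); it only shows which ports such a walk
would touch and is not a proposed patch.

```c
#include <stddef.h>

/* Hypothetical model of a bridge on the path from an endpoint to the root. */
struct model_port {
	struct model_port *parent;	/* NULL above the Root Port */
	unsigned int mps;		/* current Max Payload Size, bytes */
};

/*
 * Walk from 'bridge' up to the root and lower every port whose MPS
 * exceeds 'limit'. Returns the number of ports that were changed.
 */
int model_clamp_path(struct model_port *bridge, unsigned int limit)
{
	int changed = 0;

	for (struct model_port *p = bridge; p != NULL; p = p->parent) {
		if (p->mps > limit) {
			p->mps = limit;
			changed++;
		}
	}
	return changed;
}
```

For the topology in this report, a 512-byte limit would change only the
Downstream Port. A smaller limit would also lower the Upstream Port and
everything above it, which are shared with sibling devices; that is the
"other unrelated sub-topologies" concern from the original commit.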

References
==========
- Commit 9f0e89359775: PCI: Match Root Port's MPS to endpoint's MPSS as necessary
- Commit 27d868b5e6cf: PCI: Set MPS to match upstream bridge
- https://bugzilla.kernel.org/show_bug.cgi?id=200527 (original ROOT_PORT fix)

Kelly Devilliv