[PATCH v2] cxl/core/port: defer endpoint probes when ACPI likely hasn't finished

From: Gregory Price
Date: Fri Oct 04 2024 - 17:26:20 EST


In cxl_acpi_probe, we add dports and uports to host bridges iteratively:
- bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_dport);
- bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_uport);

Simultaneously, as ports are probed, memdev endpoints can also be
probed. This creates a race condition, where an endpoint can perceive
its path to the root being broken in devm_cxl_enumerate_ports.

The memdev/endpoint probe will see a heirarchy that may look like:
mem1
parent => 0000:c1:00.0
parent => 0000:c0:01.1
parent->parent => NULL

This results in find_cxl_port() returning NULL (since the port hasn't
been associated with the host bridge yet), and add_port_attach_ep
fails because the grandparent's grandparent is NULL.

When the latter condition is detected, the comments suggest:
/*
* The iteration reached the topology root without finding the
* CXL-root 'cxl_port' on a previous iteration, fail for now to
* be re-probed after platform driver attaches.
*/

This case results in an -ENXIO; however, a re-probe never occurs. Change
this return condition to -EPROBE_DEFER to explicitly cause a reprobe.

(v2: additional debug information)

Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
---
drivers/cxl/core/port.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 1d5007e3795a..d6bebf70d142 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
*/
dev_dbg(&cxlmd->dev, "%s is a root dport\n",
dev_name(dport_dev));
- return -ENXIO;
+ return -EPROBE_DEFER;
}

parent_port = find_cxl_port(dparent, &parent_dport);
--
2.43.0