Re: [PATCH v13 20/25] CXL/PCI: Introduce CXL Port protocol error handlers

From: Bjorn Helgaas
Date: Mon Dec 08 2025 - 13:38:01 EST


On Mon, Nov 03, 2025 at 06:09:56PM -0600, Terry Bowman wrote:
> Add CXL protocol error handlers for CXL Port devices (Root Ports,
> Downstream Ports, and Upstream Ports). Implement cxl_port_cor_error_detected()
> and cxl_port_error_detected() to handle correctable and uncorrectable errors
> respectively.
>
> Introduce cxl_get_ras_base() to retrieve the cached RAS register base
> address for a given CXL port. This function supports CXL Root Ports,
> Downstream Ports, and Upstream Ports by returning their previously mapped
> RAS register addresses.
>
> Add device lock assertions to protect against concurrent device or RAS
> register removal during error handling. The port error handlers require
> two device locks:
>
> 1. The port's CXL parent device - RAS registers are mapped using devm_*
> functions with the parent port as the host. Locking the parent prevents
> the RAS registers from being unmapped during error handling.
>
> 2. The PCI device (pdev->dev) - Locking prevents concurrent modifications
> to the PCI device structure during error handling.
>
> The lock assertions added here will be satisfied by device locks introduced
> in a subsequent patch.

Weird. Can't you add the lock assertions in the same patch that adds
the locks?
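
To illustrate the concern, here is a toy userspace model (not kernel
code). It assumes the handlers are reachable before the later locking
patch lands. Every name in it (toy_dev, toy_lock_assert(), and so on)
is hypothetical; toy_lock_assert() only stands in for
device_lock_assert().

```c
#include <stdbool.h>

/* Toy stand-in for struct device; not the kernel structure. */
struct toy_dev {
	bool locked;		/* models device_lock() being held */
	unsigned int warns;	/* counts failed lock assertions */
};

static void toy_lock(struct toy_dev *d)
{
	d->locked = true;
}

static void toy_unlock(struct toy_dev *d)
{
	d->locked = false;
}

/* Models device_lock_assert(): record a warning if the lock is not held. */
static void toy_lock_assert(struct toy_dev *d)
{
	if (!d->locked)
		d->warns++;
}

/* Error handler that carries the assertion. */
static void toy_handler(struct toy_dev *d)
{
	toy_lock_assert(d);
	/* ... read RAS registers here ... */
}

/* Caller as of this patch: no lock is taken, so the assertion trips. */
static void toy_caller_unlocked(struct toy_dev *d)
{
	toy_handler(d);
}

/* Caller once the locking patch is applied: the assertion is satisfied. */
static void toy_caller_locked(struct toy_dev *d)
{
	toy_lock(d);
	toy_handler(d);
	toy_unlock(d);
}
```

In this model, anyone building at the intermediate commit hits the
warning in toy_caller_unlocked(); adding the lock and the assertion
together avoids that window.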

> Introduce get_pci_cxl_host_dev() to return the device responsible for
> managing the RAS register mapping. This function increments the reference
> count on the host device to prevent premature resource release during error
> handling. The caller is responsible for decrementing the reference count.
> For CXL endpoints, which manage resources without a separate host device,
> this function returns NULL.
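
For readers following along, the get/put contract described above can
be sketched in userspace roughly as below. This is only a model under
my own assumptions, not the patch's implementation: toy_obj,
toy_get_host() and toy_handle_error() are hypothetical names, and
toy_get()/toy_put() merely mimic the NULL-tolerant behavior of
get_device()/put_device().

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a refcounted device; not the kernel's struct device. */
struct toy_obj {
	int refs;
	bool is_endpoint;
	struct toy_obj *host;	/* device owning the RAS mapping, if any */
};

/* Models get_device(): take a reference, tolerate NULL. */
static struct toy_obj *toy_get(struct toy_obj *o)
{
	if (o)
		o->refs++;
	return o;
}

/* Models put_device(): drop a reference, tolerate NULL. */
static void toy_put(struct toy_obj *o)
{
	if (o)
		o->refs--;
}

/* Returns a referenced host, or NULL for endpoints (no separate host). */
static struct toy_obj *toy_get_host(struct toy_obj *o)
{
	if (o->is_endpoint)
		return NULL;
	return toy_get(o->host);
}

/* Caller side: exactly one put for the get, on every path. */
static bool toy_handle_error(struct toy_obj *o)
{
	struct toy_obj *host = toy_get_host(o);
	bool had_host = host != NULL;

	/* ... handle the error while the host cannot go away ... */

	toy_put(host);	/* safe even when host is NULL */
	return had_host;
}
```

Because the put tolerates NULL, the endpoint case needs no special
branch in the caller.
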
>
> Update the AER driver's is_cxl_error() to recognize CXL Port devices in
> addition to CXL Endpoints, as both now have CXL-specific error handlers.
>
> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>

Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

> @@ -1573,6 +1573,7 @@ static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
>  		return to_cxl_port(dev);
>  	return NULL;
>  }
> +EXPORT_SYMBOL_NS_GPL(find_cxl_port_by_uport, "CXL");

The usual export question: is there a modular caller?

> + dev_warn_once(dev, "Error: Unsupported device type (%X)", pci_pcie_type(pdev));

Maybe "%#x" (add 0x prefix and use lower-case hex, unless there's a
different CXL convention)?
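
To make the difference concrete, here is a small userspace sketch using
the C library's snprintf() rather than the kernel's printk machinery.
The helper names are hypothetical. The value 0xa is used only as an
example input; it corresponds to PCI_EXP_TYPE_RC_EC.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a port type as the patch does ("%X"). */
static int fmt_type_upper(char *buf, size_t len, unsigned int type)
{
	return snprintf(buf, len, "%X", type);
}

/* Hypothetical helper: format a port type with the suggested "%#x". */
static int fmt_type_prefixed(char *buf, size_t len, unsigned int type)
{
	return snprintf(buf, len, "%#x", type);
}
```

With type 0xa, the first helper produces "A" and the second "0xa"; the
latter is unambiguous as a hex value in a log line.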