Re: [RESEND v13 21/25] PCI/AER: Dequeue forwarded CXL error
From: dan.j.williams
Date: Wed Nov 19 2025 - 22:34:06 EST
Terry Bowman wrote:
> The AER driver now forwards CXL protocol errors to the CXL driver via a
> kfifo. The CXL driver must consume these work items, initiate protocol
> error handling, and ensure RAS mappings remain valid throughout processing.
>
> Implement cxl_proto_err_work_fn() to dequeue work items forwarded by the
> AER service driver and begin protocol error processing by calling
> cxl_handle_proto_error().
>
> Add a PCI device lock on &pdev->dev within cxl_proto_err_work_fn() to
> keep the PCI device structure valid during handling. Locking an Endpoint
> will also defer RAS unmapping until the device is unlocked.
>
> For Endpoints, add a lock on CXL memory device cxlds->dev. The CXL memory
> device structure holds the RAS register reference needed during error
> handling.
>
> Add lock for the parent CXL Port for Root Ports, Downstream Ports, and
> Upstream Ports to prevent destruction of structures holding mapped RAS
> addresses while they are in use.
>
> Invoke cxl_do_recovery() for uncorrectable errors. Treat this as a stub for
> now; implement its functionality in a future patch.
>
> Export pcie_clear_device_status() to enable cleanup of AER status following
> error handling.
>
> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
>
> ---
> Changes in v12->v13:
> - Add cxlmd lock using guard() (Terry)
> - Remove exporting of unused function, pci_aer_clear_fatal_status() (Dave Jiang)
> - Change pr_err() calls to ratelimited. (Terry)
> - Update commit message. (Terry)
> - Remove namespace qualifier from pcie_clear_device_status()
> export (Dave Jiang)
> - Move locks into cxl_proto_err_work_fn() (Dave)
> - Update log messages in cxl_forward_error() (Ben)
>
> Changes in v11->v12:
> - Add guard for CE case in cxl_handle_proto_error() (Dave)
>
> Changes in v10->v11:
> - Reword patch commit message to remove RCiEP details (Jonathan)
> - Add #include <linux/bitfield.h> (Terry)
> - is_cxl_rcd() - Fix short comment message wrap (Jonathan)
> - is_cxl_rcd() - Combine return calls into 1 (Jonathan)
> - cxl_handle_proto_error() - Move comment earlier (Jonathan)
> - Use FIELD_GET() in discovering class code (Jonathan)
> - Remove BDF from cxl_proto_err_work_data. Use 'struct
> pci_dev *' (Dan)
> ---
> drivers/cxl/core/ras.c | 153 ++++++++++++++++++++++++++++++++++++++---
> drivers/pci/pci.c | 1 +
> drivers/pci/pci.h | 1 -
> include/linux/pci.h | 2 +
> 4 files changed, 145 insertions(+), 12 deletions(-)
[..]
> +static void cxl_proto_err_work_fn(struct work_struct *work)
> +{
> + struct cxl_proto_err_work_data wd;
> +
> + while (cxl_proto_err_kfifo_get(&wd)) {
> + struct pci_dev *pdev __free(pci_dev_put) = pci_dev_get(wd.pdev);
Why does this function need its own device reference? I think this
handler should match PCI AER semantics, where device validity is
guaranteed by the caller.
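
To make the ownership question concrete, here is a rough user-space sketch
of one way "caller guaranteed" could look: the forwarding side pins the
device when it queues the work item, and the work function only drops that
reference when it is done. This is an assumption about the intended model,
not the AER code. Every name below (fake_dev, forward_error(),
error_work_fn(), the one-slot queue) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device; not struct pci_dev. */
struct fake_dev {
	int refcount;
	int handled;
};

static struct fake_dev *fake_dev_get(struct fake_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void fake_dev_put(struct fake_dev *dev)
{
	if (dev) {
		assert(dev->refcount > 0);
		dev->refcount--;
	}
}

/* One-slot "queue" standing in for the kfifo. */
static struct fake_dev *slot;

/* Producer: pins the device before handing it off to the consumer. */
static int forward_error(struct fake_dev *dev)
{
	if (slot)
		return -1;	/* queue full, no reference taken */
	slot = fake_dev_get(dev);
	return 0;
}

/* Consumer: relies on the producer's reference and drops it when done. */
static void error_work_fn(void)
{
	struct fake_dev *dev = slot;

	if (!dev)
		return;
	slot = NULL;
	dev->handled++;		/* error handling would happen here */
	fake_dev_put(dev);
}
```

In this model the consumer never takes a second reference, so there is no
window where the queued pointer is unpinned.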
> + struct device *cxlmd_dev;
> +
> + if (!pdev) {
> + pr_err_ratelimited("NULL PCI device passed in AER-CXL KFIFO\n");
> + continue;
> + }
> +
> + guard(device)(&pdev->dev);
> + if (is_pcie_endpoint(pdev)) {
> + struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> +
> + if (!cxl_pci_drv_bound(pdev))
> + return;
> + cxlmd_dev = &cxlds->cxlmd->dev;
> + device_lock_if(cxlmd_dev, cxlmd_dev);
Ok, I think this demonstrates the problematic usage of
cxl_pci_drv_bound(), and the presence of conditional locking is also a
tell that this is broken.

My expectation is that CXL protocol errors are exclusively reported to
cxl_ports. That means that all RAS register mapping must be exclusively
relative to the cxl_port::probe() to cxl_port::remove() lifetime. Once
that is in place this endpoint case melts away. The endpoint's job is to
register an endpoint-port to get protocol error services.
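
As a rough illustration of that lifetime rule, here is a user-space sketch
in which the RAS mapping exists only between a port's probe and remove, and
the error handler takes one unconditional port lock and checks the bound
state under it. This is a toy model under stated assumptions, not a
proposed implementation. Every name (fake_port, port_probe(),
port_remove(), port_handle_proto_error()) is hypothetical, and a boolean
stands in for the device lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cxl_port; not the kernel structure. */
struct fake_port {
	bool locked;		/* models the port's device lock */
	bool bound;		/* true between probe and remove */
	unsigned int *ras;	/* valid only while bound */
};

static unsigned int fake_ras_regs;	/* backing store for the "registers" */

static void port_lock(struct fake_port *p)
{
	assert(!p->locked);
	p->locked = true;
}

static void port_unlock(struct fake_port *p)
{
	assert(p->locked);
	p->locked = false;
}

/* probe: the only place the RAS mapping is established. */
static void port_probe(struct fake_port *p)
{
	port_lock(p);
	p->ras = &fake_ras_regs;
	p->bound = true;
	port_unlock(p);
}

/* remove: the only place the RAS mapping is torn down. */
static void port_remove(struct fake_port *p)
{
	port_lock(p);
	p->bound = false;
	p->ras = NULL;
	port_unlock(p);
}

/* Returns true if the error was handled, false if the port is not bound. */
static bool port_handle_proto_error(struct fake_port *p, unsigned int status)
{
	bool handled = false;

	port_lock(p);		/* one unconditional lock, no endpoint special case */
	if (p->bound) {
		*p->ras |= status;	/* mapping cannot go away while locked */
		handled = true;
	}
	port_unlock(p);
	return handled;
}
```

Because the mapping and the bound state change only under the port lock,
the handler needs neither a driver-bound check on the PCI device nor a
conditional lock on a second device.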

Given that time is short for v6.19, I might take a quick stab at this to
demonstrate the proposal (or otherwise try to quickly discover why the
suggestion cannot work).