Re: [PATCH v13 16/25] CXL/AER: Introduce pcie/aer_cxl_vh.c in AER driver for forwarding CXL errors
From: Bjorn Helgaas
Date: Mon Dec 08 2025 - 13:36:37 EST
Maybe:
PCI/AER: Add CXL error forwarding in aer_cxl_vh.c
On Mon, Nov 03, 2025 at 06:09:52PM -0600, Terry Bowman wrote:
> CXL virtual hierarchy (VH) RAS handling for CXL Port devices will be added
> soon. This requires a notification mechanism for the AER driver to share
> the AER interrupt with the CXL driver. The notification will be used as an
> indication for the CXL drivers to handle and log the CXL RAS errors.
>
> Note, 'CXL protocol error' terminology will refer to CXL VH and not
> CXL RCH errors unless specifically noted going forward.
>
> Introduce a new file in the AER driver to handle the CXL protocol errors
> named pci/pcie/aer_cxl_vh.c.
>
> Add a kfifo work queue to be used by the AER and CXL drivers. The AER
> driver will be the sole kfifo producer adding work and the cxl_core will be
> the sole kfifo consumer removing work. Add the boilerplate kfifo support.
> Encapsulate the kfifo, RW semaphore, and work pointer in a single structure.
>
> Add CXL work queue handler registration functions in the AER driver. Export
> the functions allowing CXL driver to access. Implement registration
> functions for the CXL driver to assign or clear the work handler function.
> Synchronize accesses using the RW semaphore.
>
> Introduce 'struct cxl_proto_err_work_data' to serve as the kfifo work data.
> This will contain a reference to the erring PCI device and the error
> severity. This will be used when the work is dequeued by the cxl_core driver.
s/erring PCI device/PCI error source device/
> +bool cxl_error_is_native(struct pci_dev *dev)
> +{
> + struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
> +
> + return (pcie_ports_native || host->native_aer);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_error_is_native, "CXL");
I don't see modular callers of any of these functions that would
require EXPORT_SYMBOL_NS_GPL().
> +++ b/include/linux/aer.h
> @@ -10,6 +10,7 @@
>
> #include <linux/errno.h>
> #include <linux/types.h>
> +#include <linux/workqueue_types.h>
Looks like "struct work_struct;" would be sufficient without including
linux/workqueue_types.h.
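
A minimal userspace sketch of why the forward declaration is enough: code that only stores or passes a pointer never needs the structure's layout. The names cxl_work_ctx_example and cxl_set_work_example() are hypothetical and not taken from the patch, and the struct work_struct body here is only a stand-in for the real kernel definition.

```c
#include <stddef.h>

/* Header side: an incomplete type is enough to declare pointers to it. */
struct work_struct;

/* Hypothetical container, for illustration only. */
struct cxl_work_ctx_example {
	struct work_struct *work;	/* pointer only, layout not needed */
};

/* Hypothetical setter, for illustration only. */
void cxl_set_work_example(struct cxl_work_ctx_example *ctx,
			  struct work_struct *work)
{
	/* Assigning a pointer to an incomplete type compiles fine. */
	ctx->work = work;
}

/*
 * Only code that creates or dereferences the object needs the full
 * definition. This body is a stand-in, not the kernel's struct.
 */
struct work_struct {
	int pending;
};
```

Only a .c file that actually defines or dereferences a work item would need the full header.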
> +struct cxl_proto_err_work_data {
> + int severity;
> + struct pci_dev *pdev;
Is there a reason to order them this way? I would have put the pdev
pointer first because it's the more general part and might result in
better alignment in memory.
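
A small sketch of the two orderings, useful for checking where the padding ends up. The struct names and err_work_hole_before_pdev() are hypothetical; struct pci_dev is left opaque because only a pointer to it is stored.

```c
#include <stddef.h>

struct pci_dev;	/* opaque here; only pointers are used */

/* Order as posted: the pointer may need padding in front of it. */
struct err_work_int_first {
	int severity;
	struct pci_dev *pdev;
};

/* Suggested order: pointer at offset 0, any padding moves to the tail. */
struct err_work_ptr_first {
	struct pci_dev *pdev;
	int severity;
};

/* The first member of a struct always sits at offset 0. */
_Static_assert(offsetof(struct err_work_ptr_first, pdev) == 0,
	       "pdev leads the struct");

/* Bytes of padding between severity and pdev in the posted layout. */
size_t err_work_hole_before_pdev(void)
{
	return offsetof(struct err_work_int_first, pdev) - sizeof(int);
}
```

On a typical LP64 build (4-byte int, 8-byte pointer) the hole is 4 bytes and both layouts occupy 16 bytes, so the reorder moves the padding to the end of the structure rather than shrinking it.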