Re: [PATCH v16 01/10] PCI/AER: Introduce AER-CXL Kfifo

From: Jonathan Cameron

Date: Mon Mar 09 2026 - 08:20:45 EST


On Mon, 2 Mar 2026 14:36:39 -0600
Terry Bowman <terry.bowman@xxxxxxx> wrote:

> CXL virtual hierarchy (VH) RAS handling for CXL Port devices will be added
> soon. This requires a notification mechanism for the AER driver to share
> the AER interrupt with the CXL driver. The notification will be used as an
> indication for the CXL drivers to handle and log the CXL RAS errors.
>
> Note, 'CXL protocol error' terminology will refer to CXL VH and not
> CXL RCH errors unless specifically noted going forward.
>
> Introduce a new file in the AER driver to handle the CXL protocol errors
> named pci/pcie/aer_cxl_vh.c.
>
> Add a kfifo work queue to be used by the AER and CXL drivers. The AER
> driver will be the sole kfifo producer adding work and the cxl_core will be
> the sole kfifo consumer removing work. Add the boilerplate kfifo support.
> Encapsulate the kfifo, RW semaphore, and work pointer in a single structure.
>
> Add CXL work queue handler registration functions in the AER driver. Export
> the functions allowing CXL driver to access. Implement registration
> functions for the CXL driver to assign or clear the work handler function.
>
> Introduce 'struct cxl_proto_err_work_data' to serve as the kfifo work data.
> This will contain a reference to the PCI error source device and the error
> severity. This will be used when the work is dequeued by the cxl_core driver.
>
> Introduce cxl_forward_error() to take a given CXL protocol error and add it
> to a work structure before pushing onto the AER-CXL kfifo. This function
> takes a reference count increment of the PCI device. The kfifo consumer is
> responsible for reference decrementing. If there is an error on adding the
> work then this function must decrement the reference count.
>
> Synchronize accesses to the work function pointer during registration,
> deregistration, enqueue, and dequeue. Further synchronization fixes will
> be added in the following patch.
>
> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
> Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>
Hi Terry,

Just some include related comments. There are a few missing that
should be included from the c file.

Thanks,

Jonathan


> diff --git a/drivers/pci/pcie/aer_cxl_vh.c b/drivers/pci/pcie/aer_cxl_vh.c
> new file mode 100644
> index 000000000000..7e2bc1894395
> --- /dev/null
> +++ b/drivers/pci/pcie/aer_cxl_vh.c
> @@ -0,0 +1,87 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2025 AMD Corporation. All rights reserved. */
> +
> +#include <linux/types.h>
> +#include <linux/kfifo.h>
> +#include <linux/aer.h>

What motivated the decisions on which includes to have here?
Generally follow include what you use, with exceptions for some
cases where there is something inherent that means a header will
always include another one.

I'd definitely expect rwsem.h and cleanup.h (for guard()) for instance.
Also workqueue.h for schedule_work()


> +#include "../pci.h"
> +#include "portdrv.h"
> +
> +#define CXL_ERROR_SOURCES_MAX 128
> +
> +struct cxl_proto_err_kfifo {
> + struct work_struct *work;
> + struct rw_semaphore rwsema;
> + DECLARE_KFIFO(fifo, struct cxl_proto_err_work_data,
> + CXL_ERROR_SOURCES_MAX);
> +};
> +
> +static struct cxl_proto_err_kfifo cxl_proto_err_kfifo = {
> + .rwsema = __RWSEM_INITIALIZER(cxl_proto_err_kfifo.rwsema)
> +};