Re: [PATCH 04/15] cxl/aer/pci: Add CXL PCIe port correctable error support in AER service driver

From: Jonathan Cameron
Date: Wed Oct 16 2024 - 12:22:50 EST


On Tue, 8 Oct 2024 17:16:46 -0500
Terry Bowman <terry.bowman@xxxxxxx> wrote:

> The AER service driver currently does not manage CXL PCIe port
> protocol errors reported by CXL root ports, CXL upstream switch ports,
> and CXL downstream switch ports. Consequently, RAS protocol errors
> from CXL PCIe port devices are not properly logged or handled.
>
> These errors are reported to the OS via the root port's AER correctable
> and uncorrectable internal error fields. While the AER driver supports
> handling downstream port protocol errors in restricted CXL host (RCH)
> mode also known as CXL1.1, it lacks the same functionality for CXL
> PCIe ports operating in virtual hierarchy (VH) mode, introduced in
> CXL2.0.
>
> To address this gap, update the AER driver to handle CXL PCIe port
> device protocol correctable errors (CE).
>
> The uncorrectable error handling (UCE) will be added in a future
> patch.
>
> Make this update alongside the existing downstream port RCH error
> handling logic, extending support to CXL PCIe ports in VH.
>
> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
Minor comments inline.

J
> ---
> drivers/pci/pcie/aer.c | 54 +++++++++++++++++++++++++++++++++---------
> 1 file changed, 43 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index dc8b17999001..1c996287d4ce 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -40,6 +40,8 @@
> #define AER_MAX_TYPEOF_COR_ERRS 16 /* as per PCI_ERR_COR_STATUS */
> #define AER_MAX_TYPEOF_UNCOR_ERRS 27 /* as per PCI_ERR_UNCOR_STATUS*/
>
> +#define CXL_DVSEC_PORT_EXTENSIONS 3

This duplicates the definition in drivers/cxl/cxlpci.h.

Maybe wrap the check up in an is_cxl_port() or similar? Or just
move the define to a header that both places can include.
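
A rough userspace sketch of the first option, for illustration only.
struct mock_pci_dev and mock_find_dvsec() are hypothetical stand-ins for
struct pci_dev and pci_find_dvsec_capability(), and the constants are
copied in by hand so the snippet builds on its own. Where the shared
helper would actually live in the tree is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied by hand so the sketch builds outside the kernel tree. */
#define PCI_VENDOR_ID_CXL		0x1e98
#define PCI_EXP_TYPE_ENDPOINT		0x0
#define PCI_EXP_TYPE_ROOT_PORT		0x4
#define PCI_EXP_TYPE_UPSTREAM		0x5
#define PCI_EXP_TYPE_DOWNSTREAM		0x6

/* Defined once, in whichever header both users end up sharing. */
#define CXL_DVSEC_PORT_EXTENSIONS	3

/* Hypothetical stand-in for struct pci_dev. */
struct mock_pci_dev {
	int pcie_type;			/* what pci_pcie_type() would return */
	bool has_port_ext_dvsec;	/* CXL port extensions DVSEC present */
};

/* Hypothetical stand-in for pci_find_dvsec_capability(): offset or 0. */
static uint16_t mock_find_dvsec(const struct mock_pci_dev *dev,
				uint16_t vendor, uint16_t dvsec)
{
	if (vendor == PCI_VENDOR_ID_CXL &&
	    dvsec == CXL_DVSEC_PORT_EXTENSIONS && dev->has_port_ext_dvsec)
		return 0x100;	/* arbitrary non-zero config space offset */

	return 0;
}

/* Single helper that both the AER and the CXL code could call. */
static bool is_cxl_port(const struct mock_pci_dev *dev)
{
	switch (dev->pcie_type) {
	case PCI_EXP_TYPE_ROOT_PORT:
	case PCI_EXP_TYPE_UPSTREAM:
	case PCI_EXP_TYPE_DOWNSTREAM:
		return mock_find_dvsec(dev, PCI_VENDOR_ID_CXL,
				       CXL_DVSEC_PORT_EXTENSIONS);
	default:
		return false;
	}
}

/* Usage example: only port types carrying the DVSEC count as CXL ports. */
static void is_cxl_port_selftest(void)
{
	struct mock_pci_dev usp = { PCI_EXP_TYPE_UPSTREAM, true };
	struct mock_pci_dev plain_usp = { PCI_EXP_TYPE_UPSTREAM, false };

	assert(is_cxl_port(&usp));
	assert(!is_cxl_port(&plain_usp));
}
```

With a helper like this the define has a single home, and callers never
need to see the DVSEC ID at all.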


> +
> struct aer_err_source {
> u32 status; /* PCI_ERR_ROOT_STATUS */
> u32 id; /* PCI_ERR_ROOT_ERR_SRC */
> @@ -941,6 +943,17 @@ static bool find_source_device(struct pci_dev *parent,
> return true;
> }
>
> +static bool is_pcie_cxl_port(struct pci_dev *dev)
> +{
> +	if ((pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) &&
> +	    (pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM) &&
> +	    (pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM))
> +		return false;
> +
> +	return (!!pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
> +					    CXL_DVSEC_PORT_EXTENSIONS));

No need for the !!. The function returns bool, so the implicit
conversion already maps any non-zero value to true and 0 to false.
Clamping to 1/0 first changes nothing.
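
A small standalone check of that claim, as a sketch rather than kernel
code. fake_find_dvsec() is a hypothetical stand-in for
pci_find_dvsec_capability(), which returns a u16 config space offset
(0 when the capability is absent). The offset value used is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: non-zero offset when found, 0 otherwise. */
static uint16_t fake_find_dvsec(bool present)
{
	return present ? 0x1c0 : 0;
}

/* As in the patch: explicit clamp to 1/0 before returning. */
static bool with_clamp(bool present)
{
	return !!fake_find_dvsec(present);
}

/* As suggested: rely on the implicit conversion to bool. */
static bool without_clamp(bool present)
{
	return fake_find_dvsec(present);
}

/* Both variants agree for the found and the not-found case. */
static void clamp_selftest(void)
{
	assert(with_clamp(true) == without_clamp(true));
	assert(with_clamp(false) == without_clamp(false));
	assert(without_clamp(true) == true);
}
```

This relies on the return type being bool (C99 _Bool). If the function
returned int, the raw offset would be passed through instead.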

> +}
> +