Re: [RFC PATCH 6/9] cxl/pci: Add trace logging for CXL PCIe port RAS errors
From: Jonathan Cameron
Date: Thu Jun 20 2024 - 08:53:43 EST
On Mon, 17 Jun 2024 15:04:08 -0500
Terry Bowman <terry.bowman@xxxxxxx> wrote:
> The cxl_pci driver uses kernel trace functions to log RAS errors for
> endpoints and RCH downstream ports. The same is needed for CXL root ports,
> CXL downstream switch ports, and CXL upstream switch ports.
>
> Add RAS correctable and RAS uncorrectable trace logging functions for
> CXL PCIe ports.
>
> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
> ---
> drivers/cxl/core/trace.h | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..5cfd9952d88a 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -48,6 +48,23 @@
> 	{ CXL_RAS_UC_IDE_RX_ERR, "IDE Rx Error" } \
> )
>
> +TRACE_EVENT(cxl_port_aer_uncorrectable_error,
> +	TP_PROTO(struct device *dev, u32 status),
By comparison with the existing cxl_aer_uncorrectable_error below, why is
there no fe (first error) or header log here? Do they not exist for ports
for some reason?

The serial number of the port might also be useful.
> +	TP_ARGS(dev, status),
> +	TP_STRUCT__entry(
> +		__string(devname, dev_name(dev))
> +		__field(u32, status)
> +	),
> +	TP_fast_assign(
> +		__assign_str(devname, dev_name(dev));
> +		__entry->status = status;
> +	),
> +	TP_printk("device=%s status='%s'",
> +		__get_str(devname),
> +		show_uc_errs(__entry->status)
> +	)
> +);
> +
> TRACE_EVENT(cxl_aer_uncorrectable_error,
> 	TP_PROTO(const struct cxl_memdev *cxlmd, u32 status, u32 fe, u32 *hl),
> 	TP_ARGS(cxlmd, status, fe, hl),
> @@ -96,6 +113,23 @@ TRACE_EVENT(cxl_aer_uncorrectable_error,
> 	{ CXL_RAS_CE_PHYS_LAYER_ERR, "Received Error From Physical Layer" } \
> )
>
> +TRACE_EVENT(cxl_port_aer_correctable_error,
> +	TP_PROTO(struct device *dev, u32 status),
> +	TP_ARGS(dev, status),
> +	TP_STRUCT__entry(
> +		__string(devname, dev_name(dev))
> +		__field(u32, status)
> +	),
> +	TP_fast_assign(
> +		__assign_str(devname, dev_name(dev));
> +		__entry->status = status;
> +	),
> +	TP_printk("device=%s status='%s'",
> +		__get_str(devname),
> +		show_ce_errs(__entry->status)
> +	)
> +);
> +
> TRACE_EVENT(cxl_aer_correctable_error,
> 	TP_PROTO(const struct cxl_memdev *cxlmd, u32 status),
> 	TP_ARGS(cxlmd, status),