Re: [RFC PATCH 6/9] cxl/pci: Add trace logging for CXL PCIe port RAS errors

From: Terry Bowman
Date: Mon Jun 24 2024 - 11:54:06 EST


Hi Jonathan,

I added responses inline below.

On 6/20/24 07:53, Jonathan Cameron wrote:
> On Mon, 17 Jun 2024 15:04:08 -0500
> Terry Bowman <terry.bowman@xxxxxxx> wrote:
>
>> The cxl_pci driver uses kernel trace functions to log RAS errors for
>> endpoints and RCH downstream ports. The same is needed for CXL root ports,
>> CXL downstream switch ports, and CXL upstream switch ports.
>>
>> Add RAS correctable and RAS uncorrectable trace logging functions for
>> CXL PCIe ports.
>>
>> Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
>> ---
>> drivers/cxl/core/trace.h | 34 ++++++++++++++++++++++++++++++++++
>> 1 file changed, 34 insertions(+)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index e5f13260fc52..5cfd9952d88a 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -48,6 +48,23 @@
>> { CXL_RAS_UC_IDE_RX_ERR, "IDE Rx Error" } \
>> )
>>
>> +TRACE_EVENT(cxl_port_aer_uncorrectable_error,
>> + TP_PROTO(struct device *dev, u32 status),
>
> By comparison with existing code, why no fe or header
> log? Don't exist for ports for some reason?
> Serial number of the port might also be useful.
>

The AER FE (first error) and header log are the same for ports as for
endpoints, so logging for them needs to be added here.
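
As a rough userspace sketch of the first-error selection such logging would
rely on (not code from this patch; the helper names, the 6-bit pointer mask,
and the example register values are assumptions for illustration): when more
than one uncorrectable status bit is set, the first error pointer in the RAS
capability control register indicates which error the header log belongs to;
otherwise the single set status bit is itself the first error.

```c
#include <stdint.h>

/*
 * Assumption for illustration: the first error pointer occupies bits 5:0
 * of the RAS capability control register.
 */
#define FE_POINTER_MASK 0x3fu

/* Count the set bits in a 32-bit value. */
static unsigned int count_bits(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: choose the "first error" value to log next to the
 * uncorrectable status. With zero or one status bit set, the status is
 * returned unchanged; with several set, the pointer selects one bit.
 */
static uint32_t pick_first_error(uint32_t uc_status, uint32_t cap_control)
{
	uint32_t idx;

	if (count_bits(uc_status) <= 1)
		return uc_status;

	idx = cap_control & FE_POINTER_MASK;
	/* A pointer past bit 31 cannot name a bit in a 32-bit status. */
	return idx < 32 ? UINT32_C(1) << idx : 0;
}
```

For example, pick_first_error(0x14, 0x2) yields 0x4: two status bits are set,
and the pointer selects bit 2. A port trace event could then carry this value
and the header log in addition to the status.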

There is no serial number for the ports.

Regards,
Terry

>> + TP_ARGS(dev, status),
>> + TP_STRUCT__entry(
>> + __string(devname, dev_name(dev))
>> + __field(u32, status)
>> + ),
>> + TP_fast_assign(
>> + __assign_str(devname, dev_name(dev));
>> + __entry->status = status;
>> + ),
>> + TP_printk("device=%s status='%s'",
>> + __get_str(devname),
>> + show_uc_errs(__entry->status)
>> + )
>> +);
>> +
>> TRACE_EVENT(cxl_aer_uncorrectable_error,
>> TP_PROTO(const struct cxl_memdev *cxlmd, u32 status, u32 fe, u32 *hl),
>> TP_ARGS(cxlmd, status, fe, hl),
>> @@ -96,6 +113,23 @@ TRACE_EVENT(cxl_aer_uncorrectable_error,
>> { CXL_RAS_CE_PHYS_LAYER_ERR, "Received Error From Physical Layer" } \
>> )
>>
>> +TRACE_EVENT(cxl_port_aer_correctable_error,
>> + TP_PROTO(struct device *dev, u32 status),
>> + TP_ARGS(dev, status),
>> + TP_STRUCT__entry(
>> + __string(devname, dev_name(dev))
>> + __field(u32, status)
>> + ),
>> + TP_fast_assign(
>> + __assign_str(devname, dev_name(dev));
>> + __entry->status = status;
>> + ),
>> + TP_printk("device=%s status='%s'",
>> + __get_str(devname),
>> + show_ce_errs(__entry->status)
>> + )
>> +);
>> +
>> TRACE_EVENT(cxl_aer_correctable_error,
>> TP_PROTO(const struct cxl_memdev *cxlmd, u32 status),
>> TP_ARGS(cxlmd, status),
>