Re: [PATCH 3/4] acpi/ghes, cxl/pci: Trace FW-First CXL Protocol Errors

From: Smita Koralahalli
Date: Thu May 23 2024 - 17:36:05 EST


On 5/22/2024 5:22 PM, Alison Schofield wrote:
> On Wed, May 22, 2024 at 03:08:38PM +0000, Smita Koralahalli wrote:
>> When PCIe AER is in FW-First, OS should process CXL Protocol errors from
>> CPER records.
>>
>> Reuse the existing work queue cxl_cper_work registered with GHES to notify
>> the CXL subsystem on a Protocol error.
>>
>> The defined trace events cxl_aer_uncorrectable_error and
>> cxl_aer_correctable_error currently trace native CXL AER errors. Reuse
>> them to trace FW-First Protocol Errors.
>
> Will the trace log differentiate between errors reported in FW-First
> versus Native mode? Wondering if that bit of info needs to be logged
> or is discoverable elsewhere.

No, the trace log won't differentiate currently.

As a side note, FW-First also logs errors in dmesg. I'm not sure whether
we will keep logging errors to dmesg going forward, but I think it may
still be needed so that we don't miss errors from an RCH Downstream Port
or the hexdump of unrecognized agent types.

Thanks
Smita


> Otherwise, LGTM,
> Reviewed-by: Alison Schofield <alison.schofield@xxxxxxxxx>
>
>>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@xxxxxxx>
>> ---
>> drivers/acpi/apei/ghes.c | 14 ++++++++++++++
>> drivers/cxl/core/pci.c | 24 ++++++++++++++++++++++++
>> drivers/cxl/cxlpci.h | 3 +++
>> drivers/cxl/pci.c | 34 ++++++++++++++++++++++++++++++++--
>> include/linux/cxl-event.h | 1 +
>> 5 files changed, 74 insertions(+), 2 deletions(-)

snip