Re: [PATCH v5 9/9] cxl/pci: Register for and process CPER events

From: Jonathan Cameron
Date: Mon Jan 08 2024 - 08:51:35 EST


On Wed, 20 Dec 2023 16:17:36 -0800
Ira Weiny <ira.weiny@xxxxxxxxx> wrote:

> If the firmware has configured CXL event support to be firmware first
> the OS can process those events through CPER records. The CXL layer has
> unique DPA to HPA knowledge and standard event trace parsing in place.
>
> CPER records contain Bus, Device, Function information which can be used
> to identify the PCI device which is sending the event.
>
> Change the PCI driver registration to include registration of a CXL
> CPER callback to process events through the trace subsystem.
>
> Use new scoped based management to simplify the handling of the PCI
> device object.
>
> NOTE this patch depends on Dan's addition of a device guard[1].
>
> [1] https://lore.kernel.org/all/170250854466.1522182.17555361077409628655.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/
>
One trivial comment inline.
The guard change Dan suggests makes sense. Otherwise I'm fine with this.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>

I'll bolt in the other stuff I need to test it from QEMU this week.
I did the protocol error first, but these are easy to add now that I have
that working.

Jonathan
> ---
> Changes for v5:
> [Smita/djbw: trace a generic UUID if the type is unknown]
> [Jonathan: clean up pci and device state error handling]
> [iweiny: consolidate the trace function]
> ---
> drivers/cxl/core/mbox.c | 49 ++++++++++++++++++++++++++++-----------
> drivers/cxl/cxlmem.h | 4 ++++
> drivers/cxl/pci.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-
> include/linux/cxl-event.h | 1 +
> 4 files changed, 98 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 06957696247b..b801faaccd45 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -836,21 +836,44 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>
> -static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> - enum cxl_event_log_type type,
> - struct cxl_event_record_raw *record)
> +void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> + enum cxl_event_log_type type,
> + enum cxl_event_type event_type,
> + const uuid_t *uuid, union cxl_event *evt)
> {
> - union cxl_event *evt = &record->event;
> - uuid_t *id = &record->id;
> -
> - if (uuid_equal(id, &CXL_EVENT_GEN_MEDIA_UUID))
> + switch (event_type) {
> + case CXL_CPER_EVENT_GEN_MEDIA:
> trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> - else if (uuid_equal(id, &CXL_EVENT_DRAM_UUID))
> + break;

Might as well return directly from each case and save a reviewer having to
check whether anything else happens after the switch.
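
Roughly the shape I have in mind, as a standalone sketch rather than the
actual patch. The enum, trace_stub() and event_trace_record() below are
hypothetical stand-ins, not the kernel definitions; only the control flow
matters:

```c
#include <stdio.h>

/* Hypothetical stand-ins for the CXL event types; not the kernel enum. */
enum event_type {
	EVENT_GENERIC,
	EVENT_GEN_MEDIA,
	EVENT_DRAM,
	EVENT_MEM_MODULE,
};

/* Records which "tracepoint" fired so the behaviour can be checked. */
static const char *last_traced;

/* Hypothetical stub standing in for the trace_cxl_*() calls. */
static void trace_stub(const char *name)
{
	last_traced = name;
	printf("trace: %s\n", name);
}

static void event_trace_record(enum event_type event_type)
{
	switch (event_type) {
	case EVENT_GEN_MEDIA:
		trace_stub("general_media");
		return;	/* nothing follows the switch, so say so here */
	case EVENT_DRAM:
		trace_stub("dram");
		return;
	case EVENT_MEM_MODULE:
		trace_stub("memory_module");
		return;
	case EVENT_GENERIC:
	default:
		/* Unknown types fall back to the generic trace. */
		trace_stub("generic");
		return;
	}
}
```

With every case returning, it is obvious at each call site that exactly one
trace fires and no common tail code runs afterwards.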

> + case CXL_CPER_EVENT_DRAM:
> trace_cxl_dram(cxlmd, type, &evt->dram);
> - else if (uuid_equal(id, &CXL_EVENT_MEM_MODULE_UUID))
> + break;
> + case CXL_CPER_EVENT_MEM_MODULE:
> trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
> - else
> - trace_cxl_generic_event(cxlmd, type, id, &evt->generic);
> + break;
> + case CXL_CPER_EVENT_GENERIC:
> + default:
> + trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
> + break;
> + }
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);