Re: [PATCH v5 9/9] cxl/pci: Register for and process CPER events

From: Dan Williams
Date: Tue Jan 09 2024 - 18:59:35 EST


Jonathan Cameron wrote:
> On Wed, 20 Dec 2023 16:17:36 -0800
> Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
>
> > If the firmware has configured CXL event support to be firmware first
> > the OS can process those events through CPER records. The CXL layer has
> > unique DPA to HPA knowledge and standard event trace parsing in place.
> >
> > CPER records contain Bus, Device, Function information which can be used
> > to identify the PCI device which is sending the event.
> >
> > Change the PCI driver registration to include registration of a CXL
> > CPER callback to process events through the trace subsystem.
> >
> > Use new scoped based management to simplify the handling of the PCI
> > device object.
> >
> > NOTE this patch depends on Dan's addition of a device guard[1].
> >
> > [1] https://lore.kernel.org/all/170250854466.1522182.17555361077409628655.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/
> >
> One trivial comment inline.
> The guard change Dan suggests makes sense. Otherwise I'm fine with this.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>
> I'll bolt in the other stuff I need to test it from QEMU this week.
> Did the protocol error first, but these are easy to add now I have
> that working,
>
> Jonathan
> > ---
> > Changes for v5:
> > [Smita/djbw: trace a generic UUID if the type is unknown]
> > [Jonathan: clean up pci and device state error handling]
> > [iweiny: consolidate the trace function]
> > ---
> > drivers/cxl/core/mbox.c | 49 ++++++++++++++++++++++++++++-----------
> > drivers/cxl/cxlmem.h | 4 ++++
> > drivers/cxl/pci.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-
> > include/linux/cxl-event.h | 1 +
> > 4 files changed, 98 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 06957696247b..b801faaccd45 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -836,21 +836,44 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> > }
> > EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >
> > -static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> > - enum cxl_event_log_type type,
> > - struct cxl_event_record_raw *record)
> > +void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> > + enum cxl_event_log_type type,
> > + enum cxl_event_type event_type,
> > + const uuid_t *uuid, union cxl_event *evt)
> > {
> > - union cxl_event *evt = &record->event;
> > - uuid_t *id = &record->id;
> > -
> > - if (uuid_equal(id, &CXL_EVENT_GEN_MEDIA_UUID))
> > + switch (event_type) {
> > + case CXL_CPER_EVENT_GEN_MEDIA:
> > trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> > - else if (uuid_equal(id, &CXL_EVENT_DRAM_UUID))
> > + break;
>
> Might as well return directly and save a reviewer having to check if anything else happens
> after the switch

Might as well keep it as an "if () else" tree as that's equally clear
and more compact.
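
To make the shape concrete, here is a rough standalone userspace sketch of the "if () else" form, not the actual patch. Everything prefixed sketch_ / SKETCH_ is a made-up stand-in for the real event types and tracepoints, and the returned strings only label which trace the real code would fire. Unknown types fall through to the generic trace, per the v5 changelog.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the event types; not the kernel's enum. */
enum sketch_event_type {
	SKETCH_EVENT_GENERIC,
	SKETCH_EVENT_GEN_MEDIA,
	SKETCH_EVENT_DRAM,
	SKETCH_EVENT_MEM_MODULE,
};

/*
 * Return a label for the trace that would be emitted for @event_type.
 * The real helper would call a tracepoint in each branch instead.
 */
static const char *sketch_trace_name(enum sketch_event_type event_type)
{
	if (event_type == SKETCH_EVENT_GEN_MEDIA)
		return "general_media";
	else if (event_type == SKETCH_EVENT_DRAM)
		return "dram";
	else if (event_type == SKETCH_EVENT_MEM_MODULE)
		return "memory_module";
	else
		return "generic";	/* unknown type: generic trace */
}

/* Small usage check of the sketch above. */
static void sketch_selfcheck(void)
{
	assert(strcmp(sketch_trace_name(SKETCH_EVENT_GEN_MEDIA),
		      "general_media") == 0);
	assert(strcmp(sketch_trace_name(SKETCH_EVENT_GENERIC),
		      "generic") == 0);
}
```

Nothing follows the tree, so there is no fall-through for a reviewer to check, which covers the concern raised against the switch.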

That then immediately raises the question of why the upper level
__cxl_event_trace_record() is calling a lower level function without the
prefix. That can be swapped later to meet common expectations, but it
feels like gymnastics to parse all the uuids *and* still pass the uuid
to the cxl_event_trace_record() helper. Yes, I see how it happens, just
not totally comfortable with the result, but not enough to hold up the
series.