Re: [PATCH 5/6] firmware/efi: Process CXL Component Events

From: Dan Williams
Date: Fri Dec 08 2023 - 16:39:35 EST


Ira Weiny wrote:
> Dan Williams wrote:
> > Ira Weiny wrote:
>
> [snip]
>
> > > +
> > > +int register_cxl_cper_notifier(struct notifier_block *nb)
> > > +{
> > > + return blocking_notifier_chain_register(&cxl_cper_chain_head, nb);
> > > +}
> > > +EXPORT_SYMBOL_NS_GPL(register_cxl_cper_notifier, CXL);
> > > +
> > > +void unregister_cxl_cper_notifier(struct notifier_block *nb)
> > > +{
> > > + blocking_notifier_chain_unregister(&cxl_cper_chain_head, nb);
> > > +}
> > > +EXPORT_SYMBOL_NS_GPL(unregister_cxl_cper_notifier, CXL);
> >
> > So I am struggling with why is this a notifier chain vs something
> > simpler and more explicit, something like:
> >
> > typedef (int)(*cxl_cper_event_fn)(struct cper_cxl_event_rec *rec)
> >
> > int register_cxl_cper(cxl_cper_event_fn func)
> > {
> > guard(rwsem_write)(cxl_cper_rwsem);
> > if (cxl_cper_event)
> > return -EBUSY;
> > cxl_cper_event = func;
> > return 0;
> > }
>
> This is easier in the register code but then the CXL code must create a
> loop over the available memdevs to match the incoming CPER record.
>
> By allowing each memdev to register their own callback they each get
> called and match the CPER record to themselves.
>
> >
> > ...do the reverse on unregister and hold the rwsem for read while
> > invoking to hold off unregistration while event processing is in flight.
> >
> > There are a couple properties of a blocking notifier chain that are
> > unwanted: chaining, only the CXL subsystem cares about seeing these
> > records,
>
> True but there are multiple memdev driver instances which care. It is not
> just 1 entity which cares about these.
>
> > and loss of type-safety, no need to redirect through a (void *)
> > payload compared to a direct call. Overall makes the implementation more
> > explicit.
>
> Let me see how it works out with your comments on the final patch. But
> the additional chain state of the notifier made this much easier in my
> head. IOW chain up any memdev which wants these notifiers.

AFAICS, the loop over memdevs is obviated by the call to pci_get_domain_bus_and_slot().

The cxl_pci driver switches from using module_pci_driver() to having a
custom module_init(). Inside that module_init() it does the registration
once with the CPER code. In the registered callback it does:

guard(mutex)(&pdev->dev->mutex);
if (pdev->driver == &cxl_pci_driver)
cxlds = pci_dev_get_drvdata(pdev);
if (!cxlds)
return -ENXIO;
cxl_event_trace_record(cxlds->cxlmd, log_type, event_type, &rec->event);

...no need for per-instance registration.

> > > diff --git a/drivers/firmware/efi/cper_cxl.h b/drivers/firmware/efi/cper_cxl.h
> > > index 86bfcf7909ec..aa3d36493586 100644
> > > --- a/drivers/firmware/efi/cper_cxl.h
> > > +++ b/drivers/firmware/efi/cper_cxl.h
> > > @@ -10,11 +10,38 @@
> > > #ifndef LINUX_CPER_CXL_H
> > > #define LINUX_CPER_CXL_H
> > >
> > > +#include <linux/cxl-event.h>
> > > +
> > > /* CXL Protocol Error Section */
> > > #define CPER_SEC_CXL_PROT_ERR \
>
> FYI this is not added code.

Yes, but I reacted to seeing enums and GUIDs with similar names and
more explicit details in the naming.

>
> > > GUID_INIT(0x80B9EFB4, 0x52B5, 0x4DE3, 0xA7, 0x77, 0x68, 0x78, \
> > > 0x4B, 0x77, 0x10, 0x48)
> >
> > I like these defines, I notice that mbox.c uses "static const"
> > defintions for something similar. Perhaps unify on the #define method?
>
> The static const's are defined such that they can be passed to the trace
> code as a reference without the creation of a temp variable. These only
> need to be used as static data.

ok.

>
> > I
> > think this define also wants a _GUID suffix to reduce potential
> > confusion between the _UUID variant and the cxl_event_log_type
> > definitions?
>
> The UUID's are never defined as a macro. I also followed the current
> convention here of prefixing 'CPER_SEC_' as per CPER_SEC_CXL_PROT_ERR.

Once those definitions start being reused in code outside of CPER where
it is not clear what a CPER_SEC is, I think the _GUID helps.

The UUIDs as macros would cleanup the open coded UUID_INIT in cxl_test.

>
> >
> > >
> > > +/* CXL Event record UUIDs are formated at GUIDs and reported in section type */
> > > +/*
> > > + * General Media Event Record
> > > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > > + */
> > > +#define CPER_SEC_CXL_GEN_MEDIA \
> > > + GUID_INIT(0xfbcd0a77, 0xc260, 0x417f, \
> > > + 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6)
> > > +
>
> 'CPER_SEC_' is in my mind different from an actual '*CPER_EVENT*'.
>
> But I can see how the macro/enums are similar enough to have confused
> you.
>
> I can append _GUID if you really want.

Yes, please.