Re: [PATCH 08/11] cxl/mem: Wire up event interrupts

From: Ira Weiny
Date: Wed Nov 30 2022 - 04:12:31 EST


On Wed, Nov 16, 2022 at 02:40:21PM +0000, Jonathan Cameron wrote:
> On Thu, 10 Nov 2022 10:57:55 -0800
> ira.weiny@xxxxxxxxx wrote:
>
> > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> >
> > CXL device events are signaled via interrupts. Each event log may have
> > a different interrupt message number. These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> >
> > Add interrupt support for event logs. Interrupts are allocated as
> > shared interrupts. Therefore, all or some event logs can share the same
> > message number.
> >
> > The driver must deal with the possibility that dynamic capacity is not
> > yet supported by a device it sees. Fallback and retry without dynamic
> > capacity if the first attempt fails.
> >
> > Device capacity event logs interrupt as part of the informational event
> > log. Check the event status to see which log has data.
> >
> > Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> >
> Hi Ira,
>
> A few comments inline.

Thanks for the review!

>
> Thanks,
>
> Jonathan
>
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 879b228a98a0..1e6762af2a00 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
>
> > /**
> > * cxl_mem_get_event_records - Get Event Records from the device
> > @@ -867,6 +870,52 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > }
> > EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> >
> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds)
> > +{
> > + struct cxl_event_interrupt_policy *policy = &cxlds->evt_int_policy;
> > + size_t policy_size = sizeof(*policy);
> > + bool retry = true;
> > + int rc;
> > +
> > + policy->info_settings = CXL_INT_MSI_MSIX;
> > + policy->warn_settings = CXL_INT_MSI_MSIX;
> > + policy->failure_settings = CXL_INT_MSI_MSIX;
> > + policy->fatal_settings = CXL_INT_MSI_MSIX;
> > + policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
> > +
> > +again:
> > + rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > + policy, policy_size, NULL, 0);
> > + if (rc < 0) {
> > + /*
> > + * If the device does not support dynamic capacity it may fail
> > + * the command due to an invalid payload. Retry without
> > + * dynamic capacity.
> > + */
>
> There are a number of ways to discover if DCD is supported that aren't based
> on try and retry like this. 9.13.3 has "basic sequence to utilize Dynamic Capacity"
> That calls out:
> Verify the necessary Dynamic Capacity commands are returned in the CEL.
>
> First I'm not sure we should set the interrupt on for DCD until we have a lot
> more of the flow handled, secondly even then we should figure out if it is supported
> at a higher level than this command and pass that info down here.

I'm not sure I really agree. The events are just traced. I think this
functionality is really orthogonal to whether any other DCD support is there.

Regardless, as I said on the call, I think deferring this is the right way to
go for now.

>
>
> > + if (retry) {
> > + retry = false;
> > + policy->dyn_cap_settings = 0;
> > + policy_size = sizeof(*policy) - sizeof(policy->dyn_cap_settings);
> > + goto again;
> > + }
> > + dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > + rc);
> > + memset(policy, CXL_INT_NONE, sizeof(*policy));
>
> Relying on all the fields being 1 byte is a bit error prone. I'd just set them all
> individually in the interests of more readable code.

Done.
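
For reference, a minimal userspace sketch of what setting the fields one by one
could look like. The struct layout and the CXL_INT_NONE value are assumptions
mirroring the patch above, and cxl_event_policy_clear() is a hypothetical helper
name, not necessarily what the next version will contain:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, for illustration only. */
#define CXL_INT_NONE	0x00

/* Assumed layout: one settings byte per event log, as in the patch. */
struct cxl_event_interrupt_policy {
	uint8_t info_settings;
	uint8_t warn_settings;
	uint8_t failure_settings;
	uint8_t fatal_settings;
	uint8_t dyn_cap_settings;
};

/* The mailbox payload is expected to be exactly five bytes. */
static_assert(sizeof(struct cxl_event_interrupt_policy) == 5,
	      "unexpected policy payload size");

/*
 * Hypothetical helper: disable every log explicitly instead of using
 * memset(), so the code stays correct even if a field ever grows past
 * one byte or CXL_INT_NONE stops being a repeatable byte pattern.
 */
static void cxl_event_policy_clear(struct cxl_event_interrupt_policy *policy)
{
	policy->info_settings = CXL_INT_NONE;
	policy->warn_settings = CXL_INT_NONE;
	policy->failure_settings = CXL_INT_NONE;
	policy->fatal_settings = CXL_INT_NONE;
	policy->dyn_cap_settings = CXL_INT_NONE;
}
```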

>
> > + return rc;
> > + }
> > +
> > + rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
> > + policy, policy_size);
>
> Add a comment on why you are reading this back (to get the msgnums in the upper
> bits) as it's not obvious to a casual reader.

Done.
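
To spell out why the read-back matters, here is a small userspace sketch of
decoding one settings byte. It assumes the interrupt mode sits in the low two
bits and the message number in the upper nibble; CXL_INT_MODE_MASK is a
hypothetical name, and the other definitions are my guess at what the helpers
used in the patch expand to:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding of one settings byte, for illustration only. */
#define CXL_INT_MSI_MSIX	0x01
#define CXL_INT_MODE_MASK	0x03	/* hypothetical name */
#define CXL_EVENT_INT_MSGNUM(setting)	(((setting) & 0xf0) >> 4)

/* Example: mode MSI/MSI-X with message number 3 reads back as 0x31. */
static_assert(CXL_EVENT_INT_MSGNUM(0x31) == 3, "msgnum is the upper nibble");

/* True if the device reports MSI/MSI-X as the mode for this log. */
static inline int cxl_evt_int_is_msi(uint8_t setting)
{
	return (setting & CXL_INT_MODE_MASK) == CXL_INT_MSI_MSIX;
}
```

The driver only writes the mode bits; the device fills in the message number,
so it is only known after the Get Event Interrupt Policy command.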

>
> > + if (rc < 0) {
> > + dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > + rc);
> > + return rc;
> > + }
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
> > +
>
> ...
>
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index e0d511575b45..64b2e2671043 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -458,6 +458,138 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > cxlds->nr_irq_vecs = nvecs;
> > }
> >
> > +struct cxl_event_irq_id {
> > + struct cxl_dev_state *cxlds;
> > + u32 status;
> > + unsigned int msgnum;
> msgnum is only here for freeing the interrupt - I'd rather we fixed
> that by using standard infrastructure (or adding some - see below).
>
> status is an indirect way of allowing us to share an interrupt handler.
> You could do that by registering a trivial wrapper for each instead.
> Then all you have left is the cxl_dev_state which could be passed
> in directly as the callback parameter removing need to have this
> structure at all. I think that might be neater.

It does avoid allocating this structure, which I like.

I've made the change.

>
> > +};
> > +
> > +static irqreturn_t cxl_event_int_thread(int irq, void *id)
> > +{
> > + struct cxl_event_irq_id *cxlid = id;
> > + struct cxl_dev_state *cxlds = cxlid->cxlds;
> > +
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_INFO)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_WARN)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_FAIL)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_FATAL)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > + if (cxlid->status & CXLDEV_EVENT_STATUS_DYNAMIC_CAP)
> > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_DYNAMIC_CAP);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t cxl_event_int_handler(int irq, void *id)
> > +{
> > + struct cxl_event_irq_id *cxlid = id;
> > + struct cxl_dev_state *cxlds = cxlid->cxlds;
> > + u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +
> > + if (cxlid->status & status)
> > + return IRQ_WAKE_THREAD;
> > + return IRQ_HANDLED;
>
> If status not set IRQ_NONE.
> Ah. I see Dave raised this as well.

Yep, done.
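
Roughly, the hard-irq check then behaves like the userspace sketch below. The
enum mirrors the kernel's irqreturn_t values, and cxl_event_irq_check() is a
hypothetical stand-in for the handler body, with 'hw_status' playing the role of
the value read from the event status register:

```c
#include <assert.h>	/* only needed by callers that assert on the result */
#include <stdint.h>

/* Userspace stand-ins for the kernel's irqreturn_t values. */
enum irqreturn {
	IRQ_NONE	= 0,
	IRQ_HANDLED	= 1,
	IRQ_WAKE_THREAD	= 2,
};

/*
 * Hypothetical helper: 'log_mask' holds the status bits serviced by this
 * vector. If none of them are set the interrupt was not ours, so report
 * IRQ_NONE and let the core try the other handlers on a shared line.
 */
static enum irqreturn cxl_event_irq_check(uint32_t hw_status, uint32_t log_mask)
{
	if (hw_status & log_mask)
		return IRQ_WAKE_THREAD;
	return IRQ_NONE;
}
```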

>
> > +}
>
> ...
>
> > +static int cxl_request_event_irq(struct cxl_dev_state *cxlds,
> > + enum cxl_event_log_type log_type,
> > + u8 setting)
> > +{
> > + struct device *dev = cxlds->dev;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + struct cxl_event_irq_id *id;
> > + unsigned int msgnum = CXL_EVENT_INT_MSGNUM(setting);
> > + int irq;
> > +
> > + /* Disabled irq is not an error */
> > + if (!cxl_evt_int_is_msi(setting) || msgnum > cxlds->nr_irq_vecs) {
>
> I don't think that second condition can occur. The language under table 8-52
> (I think) means that it will move around if there aren't enough vectors
> (for MSI - MSI-X is more complex, but result the same).

Based on the other review, this is now just a bool, msi_enabled, which is used
to determine whether this should be set up at all.

>
> > + dev_dbg(dev, "Event interrupt not enabled; %s %u %d\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_INFO),
> > + msgnum, cxlds->nr_irq_vecs);
> > + return 0;
> > + }
> > +
> > + id = devm_kzalloc(dev, sizeof(*id), GFP_KERNEL);
> > + if (!id)
> > + return -ENOMEM;
> > +
> > + id->cxlds = cxlds;
> > + id->msgnum = msgnum;
> > + id->status = log_type_to_status(log_type);
> > +
> > + irq = pci_request_irq(pdev, id->msgnum, cxl_event_int_handler,
> > + cxl_event_int_thread, id,
> > + "%s:event-log-%s", dev_name(dev),
> > + cxl_event_log_type_str(log_type));
> > + if (irq)
> > + return irq;
> > +
> > + devm_add_action_or_reset(dev, cxl_free_event_irq, id);
>
> Hmm. no pcim_request_irq() maybe this is the time to propose one
> (separate from this patch so we don't get delayed by that!)

Perhaps. But not tonight... ;-)

>
> We discussed this way back in DOE series (I'd forgotten but lore found
> it for me). There I suggested just calling
> devm_request_threaded_irq() directly as a work around.

Yeah, that works fine. One issue is that we lose the formatted printing of the
irq name:

...
29: ... PCI-MSI 100663300-edge 0000:c0:00.0:event-log-Fatal
30: ... PCI-MSI 100663301-edge 0000:c0:00.0:event-log-Failure
31: ... PCI-MSI 100663302-edge 0000:c0:00.0:event-log-Warning
32: ... PCI-MSI 100663303-edge 0000:c0:00.0:event-log-Informational
...

Thanks,
Ira

>
> > + return 0;
> > +}
> > +
> > +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > + struct device *dev = cxlds->dev;
> > + u8 setting;
> > +
> > + if (cxl_event_config_msgnums(cxlds))
> > + return;
> > +
> > + /*
> > + * Dynamic Capacity shares the info message number
> > + * Nothing to be done except check the status bit in the
> > + * irq thread.
> > + */
> > + setting = cxlds->evt_int_policy.info_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_INFO, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));
> > +
> > + setting = cxlds->evt_int_policy.warn_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_WARN, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_WARN));
> > +
> > + setting = cxlds->evt_int_policy.failure_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FAIL, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_FAIL));
> > +
> > + setting = cxlds->evt_int_policy.fatal_settings;
> > + if (cxl_request_event_irq(cxlds, CXL_EVENT_TYPE_FATAL, setting))
> > + dev_err(dev, "Failed to get interrupt for %s event log\n",
> > + cxl_event_log_type_str(CXL_EVENT_TYPE_FATAL));
> > +}
>