Re: [PATCH V12 6/9] cxl/port: Read CDAT table
From: Ira Weiny
Date: Wed Jun 29 2022 - 23:36:01 EST
On Tue, Jun 28, 2022 at 03:47:27PM +0100, Jonathan Cameron wrote:
> On Mon, 27 Jun 2022 21:15:24 -0700
> ira.weiny@xxxxxxxxx wrote:
>
> > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> >
> > The OS will need CDAT data from CXL devices to properly set up
> > interleave sets. Currently this is supported through a DOE mailbox
> > which supports CDAT.
> >
> > Search the DOE mailboxes available, query CDAT data, and cache the data
> > for later parsing.
> >
> > Provide a sysfs binary attribute to allow dumping of the CDAT.
> >
> > Binary dumping is modeled on /sys/firmware/ACPI/tables/
> >
> > The ability to dump this table will be very useful for emulation of real
> > devices once they become available as QEMU CXL type 3 device emulation will
> > be able to load this file in.
> >
> > This does not support table updates at runtime. It will always provide
> > whatever was there when first cached. Handling of table updates can be
> > implemented later.
> >
> > Finally create a complete list of CDAT defines within cdat.h for code
> > wishing to decode the CDAT table.
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
>
> One query inline, though I'm fine with it either way, just want to
> understand your logic in keeping completion when Dan suggested using
> flush_work() to achieve the same thing.
>
> >
> > ---
> > Changes from V11:
> > Adjust for the use of DOE mailbox xarray
> > Dan Williams:
> > Remove unnecessary get/put device
> > Use new BIN_ATTR_ADMIN_RO macro
> > Flag that CDAT was supported
> > If there is a read error then the CDAT sysfs
> > will return a 0 length entry
> >
> ...
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index c4c99ff7b55e..4bd479ec0253 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -4,10 +4,12 @@
>
> > +static int cxl_cdat_get_length(struct device *dev,
> > + struct pci_doe_mb *cdat_mb,
> > + size_t *length)
> > +{
> > + u32 cdat_request_pl = CDAT_DOE_REQ(0);
> > + u32 cdat_response_pl[32];
> > + DECLARE_COMPLETION_ONSTACK(c);
> > + struct pci_doe_task task = {
> > + .prot.vid = PCI_DVSEC_VENDOR_ID_CXL,
> > + .prot.type = CXL_DOE_PROTOCOL_TABLE_ACCESS,
> > + .request_pl = &cdat_request_pl,
> > + .request_pl_sz = sizeof(cdat_request_pl),
> > + .response_pl = cdat_response_pl,
> > + .response_pl_sz = sizeof(cdat_response_pl),
> > + .complete = cxl_doe_task_complete,
> > + .private = &c,
> > + };
> > + int rc = 0;
> > +
> > + rc = pci_doe_submit_task(cdat_mb, &task);
> > + if (rc < 0) {
> > + dev_err(dev, "DOE submit failed: %d", rc);
> > + return rc;
> > + }
> > + wait_for_completion(&c);
>
> Dan mentioned in his review that we could just use
> flush_work() and drop the completion logic and callback.
> Why didn't you go that way? Would want to wrap it up
> in pci_doe_wait_task() or something like that.
In early reviews of the Aux Bus work Dan also specified the above design
pattern.
"I would expect that the caller of this routine [pci_doe_exchange_sync]
would want to specify the task and end_task() callback and use that as
the completion signal. It may also want "no wait" behavior where it is
prepared for the DOE result to come back sometime later. With that
change the exchange fields can move into the task directly."
-- https://lore.kernel.org/linux-cxl/CAPcyv4hYAgyf-WcArGvbWHAJgc5+p=OO_6ah_dXJhNM5cXcVTw@xxxxxxxxxxxxxx/
I've renamed pci_doe_exchange_sync() pci_doe_submit_task() because of other
feedback. Here the callback is set to cxl_doe_task_complete() and we have to
wait because this particular query needs the length prior to the next task
being issued.
I use flush_workqueue() within the shutdown handling (or if someone destroys
the mailbox or abort fails) to first block new work from being submitted
(PCI_DOE_FLAG_DEAD), cancel the running work if needed (PCI_DOE_FLAG_CANCEL
[was ABORT]), and then flush the queue causing all the pending work to error
when seeing PCI_DOE_FLAG_DEAD.
Ira
>
> > +
> > + if (task.rv < 1)
> > + return -EIO;
> > +
> > + *length = cdat_response_pl[1];
> > + dev_dbg(dev, "CDAT length %zu\n", *length);
> > +
> > + return rc;
> > +}
> > +