Re: [PATCH V12 6/9] cxl/port: Read CDAT table
From: Jonathan Cameron
Date: Thu Jun 30 2022 - 11:46:12 EST
On Wed, 29 Jun 2022 20:35:34 -0700
Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> On Tue, Jun 28, 2022 at 03:47:27PM +0100, Jonathan Cameron wrote:
> > On Mon, 27 Jun 2022 21:15:24 -0700
> > ira.weiny@xxxxxxxxx wrote:
> >
> > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > >
> > > The OS will need CDAT data from CXL devices to properly set up
> > > interleave sets. Currently this is supported through a DOE mailbox
> > > which supports CDAT.
> > >
> > > Search the DOE mailboxes available, query CDAT data, and cache the data
> > > for later parsing.
> > >
> > > Provide a sysfs binary attribute to allow dumping of the CDAT.
> > >
> > > Binary dumping is modeled on /sys/firmware/ACPI/tables/
> > >
> > > The ability to dump this table will be very useful for emulation of real
> > > devices once they become available as QEMU CXL type 3 device emulation will
> > > be able to load this file in.
> > >
> > > This does not support table updates at runtime. It will always provide
> > > whatever was there when first cached. Handling of table updates can be
> > > implemented later.
> > >
> > > Finally create a complete list of CDAT defines within cdat.h for code
> > > wishing to decode the CDAT table.
> > >
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> >
> > One query inline, though I'm fine with it either way, just want to
> > understand your logic in keeping completion when Dan suggested using
> > flush_work() to achieve the same thing.
> >
> > >
> > > ---
> > > Changes from V11:
> > > Adjust for the use of DOE mailbox xarray
> > > Dan Williams:
> > > Remove unnecessary get/put device
> > > Use new BIN_ATTR_ADMIN_RO macro
> > > Flag that CDAT was supported
> > > If there is a read error then the CDAT sysfs
> > > will return a 0 length entry
> > >
> > ...
> > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > > index c4c99ff7b55e..4bd479ec0253 100644
> > > --- a/drivers/cxl/core/pci.c
> > > +++ b/drivers/cxl/core/pci.c
> > > @@ -4,10 +4,12 @@
> >
> > > +static int cxl_cdat_get_length(struct device *dev,
> > > + struct pci_doe_mb *cdat_mb,
> > > + size_t *length)
> > > +{
> > > + u32 cdat_request_pl = CDAT_DOE_REQ(0);
> > > + u32 cdat_response_pl[32];
> > > + DECLARE_COMPLETION_ONSTACK(c);
> > > + struct pci_doe_task task = {
> > > + .prot.vid = PCI_DVSEC_VENDOR_ID_CXL,
> > > + .prot.type = CXL_DOE_PROTOCOL_TABLE_ACCESS,
> > > + .request_pl = &cdat_request_pl,
> > > + .request_pl_sz = sizeof(cdat_request_pl),
> > > + .response_pl = cdat_response_pl,
> > > + .response_pl_sz = sizeof(cdat_response_pl),
> > > + .complete = cxl_doe_task_complete,
> > > + .private = &c,
> > > + };
> > > + int rc = 0;
> > > +
> > > + rc = pci_doe_submit_task(cdat_mb, &task);
> > > + if (rc < 0) {
> > > + dev_err(dev, "DOE submit failed: %d", rc);
> > > + return rc;
> > > + }
> > > + wait_for_completion(&c);
> >
> > Dan mentioned in his review that we could just use
> > flush_work() and drop the completion logic and callback.
> > Why didn't you go that way? Would want to wrap it up
> > in pci_doe_wait_task() or something like that.
>
> In early reviews of the Aux Bus work Dan also specified the above design
> pattern.
>
> "I would expect that the caller of this routine [pci_doe_exchange_sync]
> would want to specify the task and end_task() callback and use that as
> the completion signal. It may also want "no wait" behavior where it is
> prepared for the DOE result to come back sometime later. With that
> change the exchange fields can move into the task directly."
>
> -- https://lore.kernel.org/linux-cxl/CAPcyv4hYAgyf-WcArGvbWHAJgc5+p=OO_6ah_dXJhNM5cXcVTw@xxxxxxxxxxxxxx/
>
> I've renamed pci_doe_exchange_sync() pci_doe_submit_task() because of other
> feedback. Here the callback is set to cxl_doe_task_complete() and we have to
> wait because this particular query needs the length prior to the next task
> being issued.
>
> I use flush_workqueue() within the shutdown handling (or if someone destroys
> the mailbox or abort fails) to first block new work from being submitted
> (PCI_DOE_FLAG_DEAD), cancel the running work if needed (PCI_DOE_FLAG_CANCEL
> [was ABORT]), and then flush the queue causing all the pending work to error
> when seeing PCI_DOE_FLAG_DEAD.
I'm lost, but I'm fine with completions anyway so no problem :)
I wasn't advocating not waiting, but potentially a different way of doing
the wait. If Dan cares about what I think he was proposing with flush_work()
I'll leave it to him to explain why.
Jonathan
>
> Ira
>
> >
> > > +
> > > + if (task.rv < 1)
> > > + return -EIO;
> > > +
> > > + *length = cdat_response_pl[1];
> > > + dev_dbg(dev, "CDAT length %zu\n", *length);
> > > +
> > > + return rc;
> > > +}
> > > +
>