RE: [PATCH V11 5/8] cxl/port: Read CDAT table

From: Dan Williams
Date: Tue Jun 21 2022 - 15:10:16 EST


Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@xxxxxxxxx>
[..]
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index c4c99ff7b55e..84dc82f7dff0 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -4,10 +4,12 @@
> > #include <linux/device.h>
> > #include <linux/delay.h>
> > #include <linux/pci.h>
> > +#include <linux/pci-doe.h>
> > #include <cxlpci.h>
> > #include <cxlmem.h>
> > #include <cxl.h>
> > #include "core.h"
> > +#include "cdat.h"
> >
> > /**
> > * DOC: cxl core pci
> > @@ -458,3 +460,173 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm)
> > return 0;
> > }
> > EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL);
> > +
> > +static struct pci_doe_mb *find_cdat_mb(struct device *uport)
> > +{
> > + struct cxl_memdev *cxlmd;
> > + struct cxl_dev_state *cxlds;
> > + int i;
> > +
> > + if (!is_cxl_memdev(uport))
> > + return NULL;
> > +
> > + cxlmd = to_cxl_memdev(uport);
> > + cxlds = cxlmd->cxlds;
>
> This feels stuck between 2 worlds. Either cxl_port_probe() needs to do
> the enumeration, or the attribute needs to move to be memdev relative.
> Given that CXL switches are going to also have CDAT data, then the
> former path needs to happen. Yes, cxl_pci still needs to do the vector
> allocation, but it does not need to do the PCI DOE probing.

It is really the interrupt setup that makes this an awkward fit all
around. The PCI core knows how to handle capabilities with interrupts,
but only for PCIe port services. DOE is both a PCIe port service *and*
and "endpoint service" like VPD (pci_vpd_init()). The more I think about
this the closer I get to the recommendation from Lukas which is that
DOE is more like pci_vpd_init() than pci_aer_init(), or a custom
enabling per driver.

If the DOE enumeration moves to a sub-function of
pci_init_capabilities() then the cxl_pci and/or cxl_port drivers just
look those up and use them. The DOE instances would remain in polled
mode unless and until a PCI driver added interrupt support late. In
other words, DOE can follow the VPD init model as long as interrupts are
not involved, and if interrupts are desired it requires late allocation
of IRQ vectors.