Re: [PATCH V12 4/9] cxl/pci: Create PCI DOE mailbox's for memory devices

From: Jonathan Cameron
Date: Tue Jun 28 2022 - 10:33:28 EST


On Mon, 27 Jun 2022 21:15:22 -0700
ira.weiny@xxxxxxxxx wrote:

> From: Ira Weiny <ira.weiny@xxxxxxxxx>
>
> DOE mailbox objects will be needed for various mailbox communications
> with each memory device.
>
> Iterate each DOE mailbox capability and create PCI DOE mailbox objects
> as found.
>
> It is not anticipated that this is the final resting place for the
> iteration of the DOE devices. The support of switch ports will drive
> this code into the PCIe side. In this imagined architecture the CXL
> port driver would then query into the PCI device for the DOE mailbox
> array.
>
> For now creating the mailboxes in the CXL port is good enough for the
> endpoints. Later PCIe ports will need to support this to support switch
> ports more generically.
>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> Cc: Lukas Wunner <lukas@xxxxxxxxx>
> Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>

My main comment on this is that we should not paper over any errors in
DOE setup. Those indicate a bug or a hardware fault, so, as with any similar
failure, we should at the very least document why it makes sense to continue.
In most cases I'd argue it doesn't, as something is very wrong.

>
> ---
> Changes from V11:
> Drop review from: Ben Widawsky <bwidawsk@xxxxxxxxxx>
> Remove irq code for now
> Adjust for pci_doe_get_int_msg_num()
> Adjust for pcim_doe_create_mb()
> (No longer need to handle the destroy.)
> Use xarray for DOE mailbox array
>
> Changes from V9:
> Bug fix: ensure DOE mailboxes are iterated before memdev add
> Ben Widawsky
> Set use_irq to false and just return on error.
> Don't return a value from devm_cxl_pci_create_doe()
> Skip allocating doe_mb array if there are no mailboxes
> Skip requesting irqs if none found.
> Ben/Jonathan Cameron
> s/num_irqs/max_irqs
>
> Changes from V8:
> Move PCI_DOE selection to CXL_BUS to support future patches
> which move queries into the port code.
> Remove Auxiliary device arch
> Squash the functionality of the auxiliary driver into this
> patch.
> Split out the irq handling a bit.
>
> Changes from V7:
> Minor code clean ups
> Rebased on cxl-pending
>
> Changes from V6:
> Move all the auxiliary device stuff to the CXL layer
>
> Changes from V5:
> Split the CXL specific stuff off from the PCI DOE create
> auxiliary device code.
> ---
> drivers/cxl/Kconfig | 1 +
> drivers/cxl/cxlmem.h | 3 +++
> drivers/cxl/pci.c | 37 +++++++++++++++++++++++++++++++++++++
> 3 files changed, 41 insertions(+)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index f64e3984689f..7adaaf80b302 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -2,6 +2,7 @@
> menuconfig CXL_BUS
> tristate "CXL (Compute Express Link) Devices Support"
> depends on PCI
> + select PCI_DOE
> help
> CXL is a bus that is electrically compatible with PCI Express, but
> layers three protocols on that signalling (CXL.io, CXL.cache, and
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 60d10ee1e7fc..360f282ef80c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -191,6 +191,7 @@ struct cxl_endpoint_dvsec_info {
> * @component_reg_phys: register base of component registers
> * @info: Cached DVSEC information about the device.
> * @serial: PCIe Device Serial Number
> + * @doe_mbs: PCI DOE mailbox array
> * @mbox_send: @dev specific transport for transmitting mailbox commands
> *
> * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -224,6 +225,8 @@ struct cxl_dev_state {
> resource_size_t component_reg_phys;
> u64 serial;
>
> + struct xarray doe_mbs;
> +
> int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> };
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 5a0ae46d4989..5821e6c1253b 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -8,6 +8,7 @@
> #include <linux/mutex.h>
> #include <linux/list.h>
> #include <linux/pci.h>
> +#include <linux/pci-doe.h>
> #include <linux/io.h>
> #include "cxlmem.h"
> #include "cxlpci.h"
> @@ -386,6 +387,37 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> return rc;
> }
>
> +static void cxl_pci_destroy_doe(void *mbs)
> +{
> + struct xarray *xa = mbs;

The local variable doesn't add anything; passing mbs straight to
xa_destroy() would do.

> +
> + xa_destroy(xa);
> +}
> +
> +static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> +{
> + struct device *dev = cxlds->dev;
> + struct pci_dev *pdev = to_pci_dev(dev);
> + u16 off = 0;
> +
> + pci_doe_for_each_off(pdev, off) {
> + struct pci_doe_mb *doe_mb;
> +
> + doe_mb = pcim_doe_create_mb(pdev, off, -1);
> + if (IS_ERR(doe_mb)) {
> + pci_err(pdev,
> + "Failed to create MB object for MB @ %x\n",
> + off);

This definitely needs at least a comment explaining why papering over this
failure is fine. My gut feeling is that we shouldn't ignore it.

> + doe_mb = NULL;
> + }
> +
> + if (xa_insert(&cxlds->doe_mbs, off, doe_mb, GFP_KERNEL))
> + break;

If we hit that break, something has gone horribly wrong and we shouldn't
paper over it either. We might have a partial list of DOEs, and callers
after this will have no way of knowing it isn't the full list.
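
To make the concern concrete, here is a small userspace sketch of the shape
I have in mind. It is not kernel code and is untested against the real APIs:
create_mb() and store_mb() are hypothetical stand-ins for
pcim_doe_create_mb() and xa_insert(), and struct dev_state is invented for
the illustration. The only point is that the loop returns the first error
instead of logging and carrying on, so the caller can tell a complete list
from a partial one.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_MBS 4

/* Hypothetical stand-in for struct pci_doe_mb. */
struct mb {
	unsigned int off;
};

/* Hypothetical stand-in for the per-device state holding the mailboxes. */
struct dev_state {
	struct mb slots[MAX_MBS];
	size_t nr_mbs;
	unsigned int bad_off;	/* offset at which creation is made to fail */
};

/* Stand-in for pcim_doe_create_mb(): fails with -EIO at bad_off. */
static int create_mb(struct dev_state *ds, unsigned int off, struct mb *out)
{
	if (off == ds->bad_off)
		return -EIO;
	out->off = off;
	return 0;
}

/* Stand-in for xa_insert(): fails with -ENOMEM once storage is full. */
static int store_mb(struct dev_state *ds, const struct mb *mb)
{
	if (ds->nr_mbs >= MAX_MBS)
		return -ENOMEM;
	ds->slots[ds->nr_mbs++] = *mb;
	return 0;
}

/*
 * Walk the given capability offsets and return the first error, rather
 * than skipping the failed mailbox or silently breaking out of the loop.
 */
static int create_all_mbs(struct dev_state *ds, const unsigned int *offs,
			  size_t n)
{
	size_t i;
	int rc;

	for (i = 0; i < n; i++) {
		struct mb mb;

		rc = create_mb(ds, offs[i], &mb);
		if (rc)
			return rc;

		rc = store_mb(ds, &mb);
		if (rc)
			return rc;
	}

	return 0;
}
```

In the real driver that would mean devm_cxl_pci_create_doe() returning an
int and probe checking it, or a comment explaining why continuing is safe.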

> +
> + pci_dbg(pdev, "Created DOE mailbox @%x\n", off);
> + }
> +}
> +
> static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
> struct cxl_register_map map;
> @@ -408,6 +440,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (IS_ERR(cxlds))
> return PTR_ERR(cxlds);
>
> + xa_init(&cxlds->doe_mbs);
> + devm_add_action(&pdev->dev, cxl_pci_destroy_doe, &cxlds->doe_mbs);

devm_add_action_or_reset()? The return value is ignored here, and if the
devm registration itself fails we want to bail out cleanly. It's vanishingly
unlikely to happen, but we should still handle that case.
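
For anyone less familiar with the difference, a simplified userspace model
of the two helpers follows. Everything named fake_* is hypothetical and only
mimics the behaviour I'm relying on: the plain variant can fail and leave
the cleanup unregistered, while the _or_reset variant runs the cleanup
immediately on failure and still returns the error for the caller to check.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ACTIONS 2

typedef void (*action_fn)(void *data);

/* Hypothetical, much simplified model of a device's devm action list. */
struct fake_dev {
	action_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t nr;
};

/* Models devm_add_action(): registration fails when the list is full. */
static int fake_add_action(struct fake_dev *dev, action_fn fn, void *data)
{
	if (dev->nr >= MAX_ACTIONS)
		return -ENOMEM;
	dev->fn[dev->nr] = fn;
	dev->data[dev->nr] = data;
	dev->nr++;
	return 0;
}

/* Models devm_add_action_or_reset(): on failure, run the action now. */
static int fake_add_action_or_reset(struct fake_dev *dev, action_fn fn,
				    void *data)
{
	int rc = fake_add_action(dev, fn, data);

	if (rc)
		fn(data);
	return rc;
}

/* Models driver detach: run the registered actions in reverse order. */
static void fake_release_all(struct fake_dev *dev)
{
	while (dev->nr > 0) {
		dev->nr--;
		dev->fn[dev->nr](dev->data[dev->nr]);
	}
}

/* Stand-in for cxl_pci_destroy_doe(): counts how often it ran. */
static void destroy_counter(void *data)
{
	(*(int *)data)++;
}
```

In probe the equivalent would be to check the return of
devm_add_action_or_reset() and return it on failure.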

> +
> cxlds->serial = pci_get_dsn(pdev);
> cxlds->cxl_dvsec = pci_find_dvsec_capability(
> pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
> @@ -434,6 +469,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
> cxlds->component_reg_phys = cxl_regmap_to_base(pdev, &map);
>
> + devm_cxl_pci_create_doe(cxlds);
> +
> rc = cxl_pci_setup_mailbox(cxlds);
> if (rc)
> return rc;