RE: [PATCH 1/2] cxl/pci: Add generic MSI-X/MSI irq support

From: Dan Williams
Date: Sat Oct 22 2022 - 18:06:05 EST


Davidlohr Bueso wrote:
> Introduce a generic irq table for CXL components/features that can have
> standard irq support - DOE requires dynamic vector sizing and is not
> considered here. For now the table is empty.
>
> Create an infrastructure to query the max vectors required for the CXL
> device. Upon successful allocation, users can plug in their respective isr
> at any point thereafter, which is supported by a new cxlds->has_irq flag,
> for example, if the irq setup is not done in the PCI driver, such as
> the case of the CXL-PMU.
>
> Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> Signed-off-by: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> ---
> drivers/cxl/cxlmem.h | 3 ++
> drivers/cxl/pci.c | 72 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 75 insertions(+)
>
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 88e3a8e54b6a..72b69b003302 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
> * @info: Cached DVSEC information about the device.
> * @serial: PCIe Device Serial Number
> * @doe_mbs: PCI DOE mailbox array
> + * @has_irq: PCIe MSI-X/MSI support
> * @mbox_send: @dev specific transport for transmitting mailbox commands
> *
> * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -247,6 +248,8 @@ struct cxl_dev_state {
>
> struct xarray doe_mbs;
>
> + bool has_irq;
> +
> int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> };
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..9c3e95ebaa26 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -428,6 +428,73 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> }
> }
>
> +/**
> + * struct cxl_irq_cap - CXL feature that is capable of receiving MSI-X/MSI irqs.
> + *
> + * @name: Name of the device/component generating this interrupt.
> + * @get_max_msgnum: Get the feature's largest interrupt message number. If the
> + * feature does not have the Interrupt Supported bit set, then
> + * return -1.
> + */
> +struct cxl_irq_cap {
> + const char *name;
> + int (*get_max_msgnum)(struct cxl_dev_state *cxlds);

Why is this a callback, why not just have the features populate their
irq numbers?

> +};
> +
> +static const struct cxl_irq_cap cxl_irq_cap_table[] = {
> + NULL
> +};
> +
> +static void cxl_pci_free_irq_vectors(void *data)
> +{
> + pci_free_irq_vectors(data);
> +}
> +
> +/*
> + * Attempt to allocate the largest amount of necessary vectors.
> + *
> + * Returns 0 upon a successful allocation of *all* vectors, or a
> + * negative value otherwise.
> + */
> +static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> + struct device *dev = cxlds->dev;
> + struct pci_dev *pdev = to_pci_dev(dev);
> + int rc, i, vectors = -1;
> +
> + for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
> + int irq;
> +
> + if (!cxl_irq_cap_table[i].get_max_msgnum)
> + continue;
> +
> + irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
> + vectors = max_t(int, irq, vectors);
> + }

Forgive me if I have missed something, I only look at interrupt enable
code once every few years, and the APIs are always a bit different, but
is this not too early to read the message number? The number is not
stable until either MSI or MSI-X has been selected below at
pci_alloc_irq_vectors() time?

> +
> + /*
> + * Semantically lack of irq support is not an error, but we
> + * still fail to allocate, so return negative.
> + */
> + if (vectors == -1)
> + return -1;
> +
> + vectors++;
> + rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
> + PCI_IRQ_MSIX | PCI_IRQ_MSI);
> + if (rc < 0)
> + return rc;
> +
> + if (rc != vectors) {
> + dev_dbg(dev, "Not enough interrupts; use polling instead.\n");
> + /* some got allocated, clean them up */
> + cxl_pci_free_irq_vectors(pdev);
> + return -ENOSPC;
> + }
> +
> + return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
> +}
> +
> static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
> struct cxl_register_map map;
> @@ -494,6 +561,11 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (rc)
> return rc;
>
> + if (!cxl_pci_alloc_irq_vectors(cxlds)) {
> + cxlds->has_irq = true;
> + } else
> + cxlds->has_irq = false;
> +
> cxlmd = devm_cxl_add_memdev(cxlds);
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
> --
> 2.38.0
>