Re: [PATCH v9 7/7] PCI: endpoint: pci-ep-msi: Add embedded doorbell fallback

From: Koichiro Den

Date: Fri Feb 20 2026 - 12:59:50 EST


On Sat, Feb 21, 2026 at 02:42:37AM +0900, Koichiro Den wrote:
> On Thu, Feb 19, 2026 at 05:13:18PM +0900, Koichiro Den wrote:
> > Some endpoint platforms cannot use platform MSI / GIC ITS to implement
> > EP-side doorbells. In those cases, EPF drivers cannot provide an
> > interrupt-driven doorbell and often fall back to polling.
> >
> > Add an "embedded" doorbell backend that uses a controller-integrated
> > doorbell target (e.g. DesignWare integrated eDMA interrupt-emulation
> > doorbell).
> >
> > The backend locates the doorbell register and a corresponding Linux IRQ
> > via the EPC aux-resource API. If the doorbell register is already
> > exposed via a fixed BAR mapping, provide BAR+offset. Otherwise provide
> > the physical address so EPF drivers can map it into BAR space.
> >
> > When MSI doorbell allocation fails with -ENODEV,
> > pci_epf_alloc_doorbell() falls back to this embedded backend.
> >
> > Signed-off-by: Koichiro Den <den@xxxxxxxxxxxxx>
> > ---
> > Changes since v8:
> > - Add MMIO address alignment check
> > - Drop 'eDMA' word from the subject
> >
> > drivers/pci/endpoint/pci-ep-msi.c | 99 ++++++++++++++++++++++++++++++-
> > 1 file changed, 97 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/endpoint/pci-ep-msi.c b/drivers/pci/endpoint/pci-ep-msi.c
> > index 50badffa9d72..f287fbf684ca 100644
> > --- a/drivers/pci/endpoint/pci-ep-msi.c
> > +++ b/drivers/pci/endpoint/pci-ep-msi.c
> > @@ -6,6 +6,8 @@
> > * Author: Frank Li <Frank.Li@xxxxxxx>
> > */
> >
> > +#include <linux/align.h>
> > +#include <linux/cleanup.h>
> > #include <linux/device.h>
> > #include <linux/export.h>
> > #include <linux/interrupt.h>
> > @@ -36,6 +38,86 @@ static void pci_epf_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> > pci_epc_put(epc);
> > }
> >
> > +static int pci_epf_alloc_doorbell_embedded(struct pci_epf *epf, u16 num_db)
> > +{
> > + const struct pci_epc_aux_resource *doorbell = NULL;
> > + struct pci_epf_doorbell_msg *msg;
> > + struct pci_epc *epc = epf->epc;
> > + struct device *dev = &epf->dev;
> > + int count, ret, i;
> > + u64 addr;
> > +
> > + count = pci_epc_get_aux_resources(epc, epf->func_no, epf->vfunc_no,
> > + NULL, 0);
> > + if (count == -EOPNOTSUPP || count == 0)
> > + return -ENODEV;
> > + if (count < 0)
> > + return count;
> > +
> > + struct pci_epc_aux_resource *res __free(kfree) =
> > + kcalloc(count, sizeof(*res), GFP_KERNEL);
> > + if (!res)
> > + return -ENOMEM;
> > +
> > + ret = pci_epc_get_aux_resources(epc, epf->func_no, epf->vfunc_no,
> > + res, count);
> > + if (ret == -EOPNOTSUPP || ret == 0)
> > + return -ENODEV;
> > + if (ret < 0)
> > + return ret;
> > +
> > + count = ret;
> > +
> > + for (i = 0; i < count; i++) {
> > + if (res[i].type == PCI_EPC_AUX_DOORBELL_MMIO) {
> > + if (doorbell) {
> > + dev_warn(dev,
> > + "Duplicate DOORBELL_MMIO resource found\n");
> > + continue;
> > + }
> > + doorbell = &res[i];
> > + }
> > + }
> > + if (!doorbell)
> > + return -ENODEV;
> > +
> > + addr = doorbell->phys_addr;
> > + if (!IS_ALIGNED(addr, sizeof(u32)))
> > + return -EINVAL;
> > +
> > + msg = kcalloc(num_db, sizeof(*msg), GFP_KERNEL);
> > + if (!msg)
> > + return -ENOMEM;
> > +
> > + /*
> > + * Embedded doorbell backends (e.g. DesignWare eDMA interrupt emulation)
> > + * typically provide a single IRQ and do not offer per-doorbell
> > + * distinguishable address/data pairs. The EPC aux resource therefore
> > + * exposes one DOORBELL_MMIO entry (u.db_mmio.irq).
> > + *
> > + * Still, pci_epf_alloc_doorbell() allows requesting multiple doorbells.
> > + * For such backends we replicate the same address/data for each entry
> > + * and mark the IRQ as shared (IRQF_SHARED). Consumers must treat them
> > + * as equivalent "kick" doorbells.
> > + */
> > + for (i = 0; i < num_db; i++)
> > + msg[i] = (struct pci_epf_doorbell_msg) {
> > + .msg.address_lo = (u32)addr,
> > + .msg.address_hi = (u32)(addr >> 32),
>
> On second thought, I'm wondering whether it makes sense to handle the case where
> the embedded doorbell target resides behind an IOMMU in this series.
>
> In v9, we simply expose the raw physical address without establishing an IOMMU
> mapping. When the EPC parent device is attached to an IOMMU domain, a Host->EP
> MMIO write through the BAR window may result in an IOMMU fault.
>
> Initially, I planned to submit IOMMU support separately as a follow-up series
> once this series is accepted, to avoid making this series too large [1].
>
> However, for consistency with the MSI doorbell case when CONFIG_IRQ_MSI_IOMMU=y,
> it might be cleaner to handle the IOVA mapping as part of this series.
>
> [1] Supporting such an IOMMU-backed case would likely require additional
> patches for vNTB + ntb_transport to demonstrate usability, such as:
> https://lore.kernel.org/all/20260118135440.1958279-12-den@xxxxxxxxxxxxx/
> https://lore.kernel.org/all/20260118135440.1958279-16-den@xxxxxxxxxxxxx/
> https://lore.kernel.org/all/20260118135440.1958279-19-den@xxxxxxxxxxxxx/
>
> Perhaps the cleanest option would be to submit these three as a prerequisite
> series.
>
> Conceptually, the change would look like the following (to be applied on top of
> this v9 Patch 9/9):

(Not 9/9 but 7/7. Sorry for the confusion.)

>
>
> diff --git a/drivers/pci/endpoint/pci-ep-msi.c b/drivers/pci/endpoint/pci-ep-msi.c
> index f287fbf684ca..05423c83ae45 100644
> --- a/drivers/pci/endpoint/pci-ep-msi.c
> +++ b/drivers/pci/endpoint/pci-ep-msi.c
> @@ -44,6 +44,9 @@ static int pci_epf_alloc_doorbell_embedded(struct pci_epf *epf, u16 num_db)
> struct pci_epf_doorbell_msg *msg;
> struct pci_epc *epc = epf->epc;
> struct device *dev = &epf->dev;
> + phys_addr_t phys_base;
> + size_t map_size, off;
> + dma_addr_t iova_base;
> int count, ret, i;
> u64 addr;
>
> @@ -85,6 +88,20 @@ static int pci_epf_alloc_doorbell_embedded(struct pci_epf *epf, u16 num_db)
> if (!IS_ALIGNED(addr, sizeof(u32)))
> return -EINVAL;
>
> + phys_base = addr & PAGE_MASK;
> + off = addr - phys_base;
> + map_size = PAGE_ALIGN(off + sizeof(u32));
> +
> + iova_base = dma_map_resource(epc->dev.parent, phys_base, map_size,
> + DMA_FROM_DEVICE, 0);
> + if (dma_mapping_error(epc->dev.parent, iova_base))
> + return -EIO;
> +
> + addr = iova_base + off;
> +
> msg = kcalloc(num_db, sizeof(*msg), GFP_KERNEL);
> - if (!msg)
> + if (!msg) {
> + dma_unmap_resource(epc->dev.parent, iova_base, map_size,
> + DMA_FROM_DEVICE, 0);
> return -ENOMEM;
> + }
> @@ -111,6 +128,8 @@ static int pci_epf_alloc_doorbell_embedded(struct pci_epf *epf, u16 num_db)
> .bar = doorbell->bar,
> .offset = (doorbell->bar == NO_BAR) ? 0 :
> doorbell->bar_offset,
> + .iova_base = iova_base,
> + .iova_size = map_size,
> };
>
> epf->num_db = num_db;
> @@ -211,11 +230,18 @@ EXPORT_SYMBOL_GPL(pci_epf_alloc_doorbell);
>
> void pci_epf_free_doorbell(struct pci_epf *epf)
> {
> + struct pci_epf_doorbell_msg *msg0;
> + struct pci_epc *epc = epf->epc;
> +
> if (!epf->db_msg)
> return;
>
> - if (epf->db_msg[0].type == PCI_EPF_DOORBELL_MSI)
> + msg0 = &epf->db_msg[0];
> + if (msg0->type == PCI_EPF_DOORBELL_MSI)
> platform_device_msi_free_irqs_all(epf->epc->dev.parent);
> + else if (msg0->type == PCI_EPF_DOORBELL_EMBEDDED)
> + dma_unmap_resource(epc->dev.parent, msg0->iova_base,
> + msg0->iova_size, DMA_FROM_DEVICE, 0);
>
> kfree(epf->db_msg);
> epf->db_msg = NULL;
> diff --git a/include/linux/pci-epf.h b/include/linux/pci-epf.h
> index cd747447a1ea..e39251a5a6f7 100644
> --- a/include/linux/pci-epf.h
> +++ b/include/linux/pci-epf.h
> @@ -176,6 +176,8 @@ struct pci_epf_doorbell_msg {
> struct msi_msg msg;
> int virq;
> unsigned long irq_flags;
> + dma_addr_t iova_base;
> + size_t iova_size;
> enum pci_epf_doorbell_type type;
> enum pci_barno bar;
> resource_size_t offset;
>
> ----8<----
>
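The window arithmetic in the conceptual diff above can be checked in
userspace. What follows is only a minimal sketch under stated assumptions:
a 4 KiB page size, and hypothetical names (SKETCH_PAGE_SIZE, struct
db_window, db_window_for(), db_target_addr(), db_split_addr()) that are not
part of the patch. It mirrors the page-aligned base, the in-page offset and
the mapped size, and then splits the resulting address the way
msg.address_lo / msg.address_hi are filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel uses PAGE_SIZE/PAGE_MASK instead. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical container for the values the conceptual diff computes. */
struct db_window {
	uint64_t phys_base; /* page-aligned start of the mapped window */
	uint64_t off;       /* doorbell offset inside that window */
	uint64_t map_size;  /* window size, rounded up to whole pages */
};

/* Mirrors: phys_base = addr & PAGE_MASK; off = addr - phys_base; ... */
static struct db_window db_window_for(uint64_t addr)
{
	struct db_window w;

	/* The patch returns -EINVAL for unaligned registers; assert here. */
	assert((addr % sizeof(uint32_t)) == 0);

	w.phys_base = addr & SKETCH_PAGE_MASK;
	w.off = addr - w.phys_base;
	/* Equivalent of PAGE_ALIGN(off + sizeof(u32)). */
	w.map_size = (w.off + sizeof(uint32_t) + SKETCH_PAGE_SIZE - 1) &
		     SKETCH_PAGE_MASK;
	return w;
}

/* Address to advertise once the window is mapped at iova_base. */
static uint64_t db_target_addr(uint64_t iova_base, const struct db_window *w)
{
	return iova_base + w->off;
}

/* Split a 64-bit address into the two 32-bit halves of an msi_msg. */
static void db_split_addr(uint64_t addr, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)addr;
	*hi = (uint32_t)(addr >> 32);
}
```

Because the register is required to be 4-byte aligned, off + sizeof(u32)
never crosses a page boundary, so map_size always comes out as exactly one
page in this model.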
> Note: pci_epc_aux_resource was intentionally designed to expose a common
> 'phys_addr' field (rather than a DMA address), because some use cases require a
> raw physical address. For example, in the remote dw-edma scenario, the host side
> programs the (EP-local) physical address directly into
> dw_edma_chip->ll_region_*[i].paddr.
>
> Frank, since this would affect Patch 9/9, I would appreciate it if you could

(Not 9/9 but 7/7. Same typo as above. Sorry.)

Koichiro

> take another look and share your thoughts. I had to drop your Reviewed-by tag in
> v9 due to a small change, so a re-review would be very helpful in any case.
>
> Niklas, any comments would be appreciated.
>
> Best regards,
> Koichiro
>
>
> > + .msg.data = doorbell->u.db_mmio.data,
> > + .virq = doorbell->u.db_mmio.irq,
> > + .irq_flags = IRQF_SHARED,
> > + .type = PCI_EPF_DOORBELL_EMBEDDED,
> > + .bar = doorbell->bar,
> > + .offset = (doorbell->bar == NO_BAR) ? 0 :
> > + doorbell->bar_offset,
> > + };
> > +
> > + epf->num_db = num_db;
> > + epf->db_msg = msg;
> > + return 0;
> > +}
> > +
> > static int pci_epf_alloc_doorbell_msi(struct pci_epf *epf, u16 num_db)
> > {
> > struct pci_epf_doorbell_msg *msg;
> > @@ -109,8 +191,21 @@ int pci_epf_alloc_doorbell(struct pci_epf *epf, u16 num_db)
> > if (!ret)
> > return 0;
> >
> > - dev_err(dev, "Failed to allocate doorbell: %d\n", ret);
> > - return ret;
> > + /*
> > + * Fall back to embedded doorbell only when platform MSI is unavailable
> > + * for this EPC.
> > + */
> > + if (ret != -ENODEV)
> > + return ret;
> > +
> > + ret = pci_epf_alloc_doorbell_embedded(epf, num_db);
> > + if (ret) {
> > + dev_err(dev, "Failed to allocate doorbell: %d\n", ret);
> > + return ret;
> > + }
> > +
> > + dev_info(dev, "Using embedded (DMA) doorbell fallback\n");
> > + return 0;
> > }
> > EXPORT_SYMBOL_GPL(pci_epf_alloc_doorbell);
> >
> > --
> > 2.51.0
> >
> >
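
The fallback rule in the pci_epf_alloc_doorbell() hunk quoted above can be
modelled in a few lines of userspace C. This is a sketch only: backend_fn,
enum db_backend, alloc_doorbell_model() and the stub backends are
hypothetical names, and only the errno handling is taken from the patch.
The embedded backend is tried only when the MSI backend reports -ENODEV;
any other error is returned unchanged.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical backend signature: returns 0 or a negative errno. */
typedef int (*backend_fn)(unsigned int num_db);

/* Which backend ended up serving the request (illustrative only). */
enum db_backend { DB_NONE, DB_MSI, DB_EMBEDDED };

static int alloc_doorbell_model(backend_fn msi, backend_fn embedded,
				unsigned int num_db, enum db_backend *used)
{
	int ret;

	assert(msi && embedded && used);
	*used = DB_NONE;

	ret = msi(num_db);
	if (!ret) {
		*used = DB_MSI;
		return 0;
	}

	/* Fall back only when platform MSI is unavailable for this EPC. */
	if (ret != -ENODEV)
		return ret;

	ret = embedded(num_db);
	if (ret)
		return ret;

	*used = DB_EMBEDDED;
	return 0;
}

/* Stub backends used to exercise the interesting paths. */
static int stub_ok(unsigned int num_db) { (void)num_db; return 0; }
static int stub_enodev(unsigned int num_db) { (void)num_db; return -ENODEV; }
static int stub_enomem(unsigned int num_db) { (void)num_db; return -ENOMEM; }
```

Passing stub_enodev as the MSI backend and stub_ok as the embedded one
exercises the fallback path; passing stub_enomem as the MSI backend shows
that a non-ENODEV failure is not masked by the fallback.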