RE: [EXT] Re: [PATCH v6 4/4] pcie: endpoint: pci-epf-vntb: add endpoint MSI support
From: Frank Li
Date: Wed Aug 31 2022 - 12:19:34 EST
> -----Original Message-----
> From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> Sent: Wednesday, August 31, 2022 5:42 AM
> To: Frank Li <frank.li@xxxxxxx>
> Cc: maz@xxxxxxxxxx; tglx@xxxxxxxxxxxxx; robh+dt@xxxxxxxxxx;
> krzysztof.kozlowski+dt@xxxxxxxxxx; shawnguo@xxxxxxxxxx;
> s.hauer@xxxxxxxxxxxxxx; kw@xxxxxxxxx; bhelgaas@xxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx; linux-arm-
> kernel@xxxxxxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; Peng Fan
> <peng.fan@xxxxxxx>; Aisheng Dong <aisheng.dong@xxxxxxx>;
> jdmason@xxxxxxxx; kernel@xxxxxxxxxxxxxx; festevam@xxxxxxxxx; dl-linux-
> imx <linux-imx@xxxxxxx>; kishon@xxxxxx; lorenzo.pieralisi@xxxxxxx;
> ntb@xxxxxxxxxxxxxxx; lznuaa@xxxxxxxxx
> Subject: [EXT] Re: [PATCH v6 4/4] pcie: endpoint: pci-epf-vntb: add endpoint
> MSI support
>
> Caution: EXT Email
>
> On Thu, Aug 18, 2022 at 10:11:27AM -0500, Frank Li wrote:
> > ┌───────┐ ┌──────────┐
> > │ │ │ │
> > ┌─────────────┐ │ │ │ PCI Host │
> > │ MSI │◄┐ │ │ │ │
> > │ Controller │ │ │ │ │ │
> > └─────────────┘ └─┼───────┼───
> ───────┼─BAR0 │
> > │ PCI │ │ BAR1 │
> > │ Func │ │ BAR2 │
> > │ │ │ BAR3 │
> > │ │ │ BAR4 │
> > │ ├─────────►│ │
> > └───────┘ └──────────┘
> >
>
> This diagram doesn't say which side is host and which one is endpoint.
> And not conveying any useful information.
[Frank Li] At V2 version, this diagram is in cover letter. Bjorn suggest move to here.
I think you have good background knowledge. But it should be helpful for new
People, who just touch this area.
I already mark "PCI Func" and "PCI Host".
>
> > Linux supports endpoint functions. PCI Host write BAR<n> space like write
> > to memory. The EP side can't know memory changed by the host driver.
> >
>
> I think you just say, that there is no defined way of raising IRQs by host
> to the endpoint.
>
> > PCI Spec has not defined a standard method to do that. Only define MSI(x)
> > to let EP notified RC status change.
> >
>
> MSI is from EP, right? Throughout the driver you should call it as "doorbell"
> and not MSI.
[Frank Li] What's I want said is that PCI standard define MSI(x) to let EP notify RC.
But there are not standard way for reverse direction. MSI should be correct here.
>
> > The basic idea is to trigger an IRQ when PCI RC writes to a memory
> > address. That's what MSI controller provided. EP drivers just need to
> > request a platform MSI interrupt, struct msi_msg *msg will pass down a
> > memory address and data. EP driver will map such memory address to one
> of
> > PCI BAR<n>. Host just writes such an address to trigger EP side irq.
> >
>
> IIUC (by looking at other patches in the series), the memory assigned for BAR
> region by the PCI host is mapped to the platform interrupt controller in
> PCI Endpoint. Such that, whenever the PCI host writes to the BAR region, it
> will trigger an IRQ in the Endpoint.
>
> This kind of setup is available in other platforms like Qualcomm where the
> mapping of a register region available in BAR0 and interrupt controller is
> done in the hardware itself. So whenever the PCI host writes to that register
> in BAR0, an IRQ will be delivered to the endpoint.
[Frank Li] Yes, not all platform have it. And EP driver have not provide a API
to get register region. I think platform msi API is pretty good API.
Many system have GIC ITS, so EP function driver can use it. Our test platform
have not ITS yet, so we added a simple MU-MSI driver to do it. I think qualcomm
platform can use similar method. So all EP function driver can use common method
to get notification from PCI host.
>
> > Add MSI support for pci-epf-vntb. pci-epf-vntb driver query if system
> > have MSI controller. Setup doorbell address according to struct msi_msg.
> >
> > So PCIe host can write this doorbell address to triger EP side's irq.
> >
> > If no MSI controller exist, fall back to software polling.
> >
> > Signed-off-by: Frank Li <Frank.Li@xxxxxxx>
> > ---
> > drivers/pci/endpoint/functions/pci-epf-vntb.c | 134 +++++++++++++++---
> > 1 file changed, 112 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > index 1466dd1904175..ad4f7ec8a39fc 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > @@ -44,6 +44,7 @@
> > #include <linux/pci-epc.h>
> > #include <linux/pci-epf.h>
> > #include <linux/ntb.h>
> > +#include <linux/msi.h>
> >
> > static struct workqueue_struct *kpcintb_workqueue;
> >
> > @@ -143,6 +144,8 @@ struct epf_ntb {
> > void __iomem *vpci_mw_addr[MAX_MW];
> >
> > struct delayed_work cmd_handler;
> > +
> > + int msi_virqbase;
>
> db_base?
>
> > };
> >
> > #define to_epf_ntb(epf_group) container_of((epf_group), struct epf_ntb,
> group)
> > @@ -253,7 +256,7 @@ static void epf_ntb_cmd_handler(struct
> work_struct *work)
> >
> > ntb = container_of(work, struct epf_ntb, cmd_handler.work);
> >
> > - for (i = 1; i < ntb->db_count; i++) {
> > + for (i = 1; i < ntb->db_count && !ntb->epf_db_phy; i++) {
>
> epf_db_phy is a wierd name. "phy" usually means the PHY controller (Physical
> layer) in kernel. If you are referring to physicall address of the doorbell,
> then you could use "phys".
>
> > if (readl(ntb->epf_db + i * 4)) {
> > if (readl(ntb->epf_db + i * 4))
> > ntb->db |= 1 << (i - 1);
> > @@ -454,11 +457,9 @@ static int epf_ntb_config_spad_bar_alloc(struct
> epf_ntb *ntb)
> > ctrl->num_mws = ntb->num_mws;
> > ntb->spad_size = spad_size;
> >
> > - ctrl->db_entry_size = 4;
> > -
> > for (i = 0; i < ntb->db_count; i++) {
> > ntb->reg->db_data[i] = 1 + i;
> > - ntb->reg->db_offset[i] = 0;
> > + ntb->reg->db_offset[i] = 4 * i;
> > }
> >
> > return 0;
> > @@ -509,6 +510,28 @@ static int epf_ntb_configure_interrupt(struct
> epf_ntb *ntb)
> > return 0;
> > }
> >
> > +static int epf_ntb_db_size(struct epf_ntb *ntb)
> > +{
> > + const struct pci_epc_features *epc_features;
> > + size_t size = 4 * ntb->db_count;
> > + u32 align;
> > +
> > + epc_features = pci_epc_get_features(ntb->epf->epc,
> > + ntb->epf->func_no,
> > + ntb->epf->vfunc_no);
> > + align = epc_features->align;
> > +
> > + if (size < 128)
> > + size = 128;
> > +
> > + if (align)
> > + size = ALIGN(size, align);
> > + else
> > + size = roundup_pow_of_two(size);
> > +
> > + return size;
> > +}
> > +
> > /**
> > * epf_ntb_db_bar_init() - Configure Doorbell window BARs
> > * @ntb: NTB device that facilitates communication between HOST and
> vHOST
> > @@ -520,35 +543,33 @@ static int epf_ntb_db_bar_init(struct epf_ntb
> *ntb)
> > struct device *dev = &ntb->epf->dev;
> > int ret;
> > struct pci_epf_bar *epf_bar;
> > - void __iomem *mw_addr;
> > + void __iomem *mw_addr = NULL;
> > enum pci_barno barno;
> > - size_t size = 4 * ntb->db_count;
> > + size_t size;
> >
> > epc_features = pci_epc_get_features(ntb->epf->epc,
> > ntb->epf->func_no,
> > ntb->epf->vfunc_no);
> > align = epc_features->align;
> > -
> > - if (size < 128)
> > - size = 128;
> > -
> > - if (align)
> > - size = ALIGN(size, align);
> > - else
> > - size = roundup_pow_of_two(size);
> > + size = epf_ntb_db_size(ntb);
> >
> > barno = ntb->epf_ntb_bar[BAR_DB];
> > + epf_bar = &ntb->epf->bar[barno];
> >
> > - mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> > - if (!mw_addr) {
> > - dev_err(dev, "Failed to allocate OB address\n");
> > - return -ENOMEM;
> > + if (!ntb->epf_db_phy) {
> > + mw_addr = pci_epf_alloc_space(ntb->epf, size, barno, align, 0);
> > + if (!mw_addr) {
> > + dev_err(dev, "Failed to allocate OB address\n");
>
> Expand OB.
>
> > + return -ENOMEM;
> > + }
> > + } else {
> > + epf_bar->phys_addr = ntb->epf_db_phy;
> > + epf_bar->barno = barno;
> > + epf_bar->size = size;
> > }
> >
> > ntb->epf_db = mw_addr;
> >
> > - epf_bar = &ntb->epf->bar[barno];
> > -
> > ret = pci_epc_set_bar(ntb->epf->epc, ntb->epf->func_no, ntb->epf-
> >vfunc_no, epf_bar);
> > if (ret) {
> > dev_err(dev, "Doorbell BAR set failed\n");
> > @@ -704,6 +725,74 @@ static int epf_ntb_init_epc_bar(struct epf_ntb
> *ntb)
> > return 0;
> > }
> >
> > +static void epf_ntb_write_msi_msg(struct msi_desc *desc, struct msi_msg
> *msg)
> > +{
> > + struct epf_ntb *ntb = dev_get_drvdata(desc->dev);
> > + struct epf_ntb_ctrl *reg = ntb->reg;
> > + int size = epf_ntb_db_size(ntb);
> > + u64 addr;
> > +
> > + addr = msg->address_hi;
> > + addr <<= 32;
> > + addr |= msg->address_lo;
> > +
> > + reg->db_data[desc->msi_index] = msg->data;
> > +
> > + if (desc->msi_index == 0)
> > + ntb->epf_db_phy = round_down(addr, size);
> > +
> > + reg->db_offset[desc->msi_index] = addr - ntb->epf_db_phy;
> > +}
> > +
> > +static irqreturn_t epf_ntb_interrupt_handler(int irq, void *data)
> > +{
> > + struct epf_ntb *ntb = data;
> > + int index;
> > +
> > + index = irq - ntb->msi_virqbase;
> > + ntb->db |= 1 << (index - 1);
> > + ntb_db_event(&ntb->ntb, index);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static void epf_ntb_epc_msi_init(struct epf_ntb *ntb)
> > +{
> > + struct device *dev = &ntb->epf->dev;
> > + struct irq_domain *domain;
> > + int virq;
> > + int ret;
> > + int i;
> > +
> > + domain = dev_get_msi_domain(ntb->epf->epc->dev.parent);
> > + if (!domain)
> > + return;
> > +
> > + dev_set_msi_domain(dev, domain);
> > +
> > + if (platform_msi_domain_alloc_irqs(&ntb->epf->dev,
> > + ntb->db_count,
> > + epf_ntb_write_msi_msg)) {
> > + dev_info(dev, "Can't allocate MSI, fall back to poll mode\n");
> > + return;
> > + }
> > +
> > + dev_info(dev, "vntb use MSI as doorbell\n");
> > +
>
> Why are you using the interrupt controller as the MSI controller? Why not
> just
> a plain interrupt controller?
[Frank Li] what's your means? I think only MSI controller support write memory to trigger irq.
>
> > + for (i = 0; i < ntb->db_count; i++) {
> > + virq = msi_get_virq(dev, i);
> > + ret = devm_request_irq(dev, virq,
> > + epf_ntb_interrupt_handler, 0,
> > + "ntb", ntb);
>
> "ntb" as a IRQ name seems quite generic. You might want to prefix it with
> epf
> or vntb...
>
> Thanks,
> Mani
>
> > +
> > + if (ret)
> > + dev_err(dev, "devm_request_irq() failure\n");
> > +
> > + if (!i)
> > + ntb->msi_virqbase = virq;
> > + }
> > +}
> > +
> > /**
> > * epf_ntb_epc_init() - Initialize NTB interface
> > * @ntb: NTB device that facilitates communication between HOST and
> vHOST2
> > @@ -1299,14 +1388,15 @@ static int epf_ntb_bind(struct pci_epf *epf)
> > goto err_bar_alloc;
> > }
> >
> > + epf_set_drvdata(epf, ntb);
> > + epf_ntb_epc_msi_init(ntb);
> > +
> > ret = epf_ntb_epc_init(ntb);
> > if (ret) {
> > dev_err(dev, "Failed to initialize EPC\n");
> > goto err_bar_alloc;
> > }
> >
> > - epf_set_drvdata(epf, ntb);
> > -
> > pci_space[0] = (ntb->vntb_pid << 16) | ntb->vntb_vid;
> > pci_vntb_table[0].vendor = ntb->vntb_vid;
> > pci_vntb_table[0].device = ntb->vntb_pid;
> > --
> > 2.35.1
> >
>
> --
> மணிவண்ணன் சதாசிவம்