Re: [PATCH] pci: irq: Add an early parameter to limit pci irq numbers

From: Manivannan Sadhasivam
Date: Sun May 28 2023 - 12:57:51 EST


On Thu, May 25, 2023 at 05:14:28PM +0800, Huacai Chen wrote:
> Hi, Bjorn,
>
> On Wed, May 24, 2023 at 11:21 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >
> > [+cc Marc, LKML]
> >
> > On Wed, May 24, 2023 at 05:36:23PM +0800, Huacai Chen wrote:
> > > Some platforms (such as LoongArch) cannot provide enough irq numbers as
> > > many as logical cpu numbers. So we should limit pci irq numbers when
> > > allocate msi/msix vectors, otherwise some device drivers may fail at
> > > initialization. This patch add a cmdline parameter "pci_irq_limit=xxxx"
> > > to control the limit.
> > >
> > > The default pci msi/msix number limit is defined 32 for LoongArch and
> > > NR_IRQS for other platforms.
> >
> > The IRQ experts can chime in on this, but this doesn't feel right to
> > me. I assume arch code should set things up so only valid IRQ numbers
> > can be allocated. This doesn't seem necessarily PCI-specific, I'd
> > prefer to avoid an arch #ifdef here, and I'd also prefer to avoid a
> > command-line parameter that users have to discover and supply.
> The problem we meet: LoongArch machines can have as many as 256
> logical cpus, and the maximum of msi vectors is 192. Even on a 64-core
> machine, 192 irqs can be easily exhausted if there are several NICs
> (NIC usually allocates msi irqs depending on the number of online
> cpus). So we want to limit the msi allocation.
>

If the MSI allocation fails with multiple vectors, then the NIC driver should
revert to a single MSI vector. Is that happening in your case?

- Mani

> This is not a LoongArch-specific problem, because I think other
> platforms can also meet if they have many NICs. But of course,
> LoongArch can meet it more easily because the available msi vectors
> are very few. So, adding a cmdline parameter is somewhat reasonable.
>
> After some investigation, I think it may be possible to modify
> drivers/irqchip/irq-loongson-pch-msi.c and override
> msi_domain_info::domain_alloc_irqs() to limit msi allocation. However,
> doing that need to remove the "static" before
> __msi_domain_alloc_irqs(), which means revert
> 762687ceb31fc296e2e1406559e8bb5 ("genirq/msi: Make
> __msi_domain_alloc_irqs() static"), I don't know whether that is
> acceptable.
>
> If such a revert is not acceptable, it seems that we can only use the
> method in this patch. Maybe rename pci_irq_limits to pci_msi_limits is
> a little better.
>
> Huacai
>
> >
> > > Signed-off-by: Juxin Gao <gaojuxin@xxxxxxxxxxx>
> > > Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
> > > ---
> > > drivers/pci/msi/msi.c | 26 +++++++++++++++++++++++++-
> > > 1 file changed, 25 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> > > index ef1d8857a51b..6617381e50e7 100644
> > > --- a/drivers/pci/msi/msi.c
> > > +++ b/drivers/pci/msi/msi.c
> > > @@ -402,12 +402,34 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
> > > return ret;
> > > }
> > >
> > > +#ifdef CONFIG_LOONGARCH
> > > +#define DEFAULT_PCI_IRQ_LIMITS 32
> > > +#else
> > > +#define DEFAULT_PCI_IRQ_LIMITS NR_IRQS
> > > +#endif
> > > +
> > > +static int pci_irq_limits = DEFAULT_PCI_IRQ_LIMITS;
> > > +
> > > +static int __init pci_irq_limit(char *str)
> > > +{
> > > + get_option(&str, &pci_irq_limits);
> > > +
> > > + if (pci_irq_limits == 0)
> > > + pci_irq_limits = DEFAULT_PCI_IRQ_LIMITS;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +early_param("pci_irq_limit", pci_irq_limit);
> > > +
> > > int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> > > struct irq_affinity *affd)
> > > {
> > > int nvec;
> > > int rc;
> > >
> > > + maxvec = clamp_val(maxvec, 0, pci_irq_limits);
> > > +
> > > if (!pci_msi_supported(dev, minvec) || dev->current_state != PCI_D0)
> > > return -EINVAL;
> > >
> > > @@ -776,7 +798,9 @@ static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *en
> > > int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
> > > int maxvec, struct irq_affinity *affd, int flags)
> > > {
> > > - int hwsize, rc, nvec = maxvec;
> > > + int hwsize, rc, nvec;
> > > +
> > > + nvec = clamp_val(maxvec, 0, pci_irq_limits);
> > >
> > > if (maxvec < minvec)
> > > return -ERANGE;
> > > --
> > > 2.39.1
> > >

--
மணிவண்ணன் சதாசிவம்