Re: [PATCH] genirq: allow selection of number of sparse irqs
From: Marc Zyngier
Date: Sat Jul 30 2022 - 05:59:16 EST
On Fri, 29 Jul 2022 19:21:56 +0100,
Daniel Walker <danielwa@xxxxxxxxx> wrote:
>
> On Thu, Jul 28, 2022 at 09:52:18AM +0100, Marc Zyngier wrote:
> > On 2022-07-28 04:04, Daniel Walker wrote:
> > > Currently the maximum number of interrupts is capped at 8260 (64 +
> > > 8196) in most of the architectures where CONFIG_SPARSE_IRQ is selected.
> > > This upper limit is not sufficient for a couple of existing SoCs from
> > > Marvell.
> > > For example, the Octeon TX2 series of processors supports a maximum
> > > of 32K interrupts.
> > >
> > > Allow configuration of the upper limit of the number of interrupts.
> > >
> > > Cc: George Cherian <george.cherian@xxxxxxxxxxx>
> > > Cc: sgoutham@xxxxxxxxxxx
> > > Cc: "BOBBY Liu (bobbliu)" <bobbliu@xxxxxxxxx>
> > > Cc: xe-linux-external@xxxxxxxxx
> > > Signed-off-by: Daniel Walker <danielwa@xxxxxxxxx>
> > > ---
> > > kernel/irq/Kconfig | 23 +++++++++++++++++++++++
> > > kernel/irq/internals.h | 10 +++++++++-
> > > 2 files changed, 32 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
> > > index db3d174c53d4..b356217abcfe 100644
> > > --- a/kernel/irq/Kconfig
> > > +++ b/kernel/irq/Kconfig
> > > @@ -125,6 +125,29 @@ config SPARSE_IRQ
> > >
> > > If you don't know what to do here, say N.
> > >
> > > +choice
> > > + prompt "Select number of sparse irqs"
> > > + depends on SPARSE_IRQ
> > > + default SPARSE_IRQ_EXTEND_8K
> > > + help
> > > + Allows choosing the number of sparse irqs available on the
> > > + system. For each 8k of additional irqs added there is
> > > + approximately 1kb of kernel size increase.
> > > +
> > > +config SPARSE_IRQ_EXTEND_8K
> > > + bool "8k"
> > > +
> > > +config SPARSE_IRQ_EXTEND_16K
> > > + bool "16K"
> > > +
> > > +config SPARSE_IRQ_EXTEND_32K
> > > + bool "32K"
> > > +
> > > +config SPARSE_IRQ_EXTEND_64K
> > > + bool "64K"
> > > +
> > > +endchoice
> > > +
> > > config GENERIC_IRQ_DEBUGFS
> > > bool "Expose irq internals in debugfs"
> > > depends on DEBUG_FS
> > > diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
> > > index f09c60393e55..25fe5abf6c16 100644
> > > --- a/kernel/irq/internals.h
> > > +++ b/kernel/irq/internals.h
> > > @@ -12,7 +12,15 @@
> > > #include <linux/sched/clock.h>
> > >
> > > #ifdef CONFIG_SPARSE_IRQ
> > > -# define IRQ_BITMAP_BITS (NR_IRQS + 8196)
> > > +# if defined(CONFIG_SPARSE_IRQ_EXTEND_8K)
> > > +# define IRQ_BITMAP_BITS (NR_IRQS + 8192 + 4)
> > > +# elif defined(CONFIG_SPARSE_IRQ_EXTEND_16K)
> > > +# define IRQ_BITMAP_BITS (NR_IRQS + 16384 + 4)
> > > +# elif defined(CONFIG_SPARSE_IRQ_EXTEND_32K)
> > > +# define IRQ_BITMAP_BITS (NR_IRQS + 32768 + 4)
> > > +# elif defined(CONFIG_SPARSE_IRQ_EXTEND_64K)
> > > +# define IRQ_BITMAP_BITS (NR_IRQS + 65536 + 4)
> > > +# endif
> > > #else
> > > # define IRQ_BITMAP_BITS NR_IRQS
> > > #endif
> >
> > It really feels like the wrong approach. If your system
> > supports a large number of interrupts (I guess it has
> > a GICv3 ITS), this shouldn't impact the lesser machines
> > (most people are using a distro kernel).
> >
> > It also doesn't really scale: the GICv3 architecture gives
> > you up to 24 bits of interrupts. Are we going to allocate
> > 2MB worth of bitmap? Future interrupt architectures may have
> > even larger interrupt spaces.
> >
> > As it turns out, we already store the irqdesc in an rb-tree.
> > It doesn't take too much imagination to turn this into an
> > xarray and use it for both allocation and tracking.
> >
> > It would also conveniently replace the irqs_resend bitmap
> > if using marks to flag the IRQs to be resent.
>
> Marvell submitted a similar change, but non-selectable, about a
> month ago.
Which wasn't really acceptable either.
>
> The limitation prevents Cisco and Marvell hardware from
> functioning. I don't think we're well versed enough on the generic
> irq system to implement what you're suggesting, and even if we did,
> Thomas would not likely accept it.
I don't think you can speak for Thomas here. In my experience of
working with him, he's in general much more inclined to look at a
scalable, long term solution than at a point hack. Especially given
that we already use xarrays for MSIs.
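
For scale, the bitmap cost discussed above can be worked out directly.
The following is a minimal userspace sketch, not kernel code: the helper
names are hypothetical, and it assumes a flat bitmap stored as an array
of unsigned long, with the bit count rounded up to a whole word.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one unsigned long on the build target. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: words needed to hold nbits, rounded up. */
static size_t bits_to_ulongs(unsigned long nbits)
{
	return (nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Hypothetical helper: bytes of storage for a flat bitmap of nbits. */
static size_t irq_bitmap_bytes(unsigned long nbits)
{
	return bits_to_ulongs(nbits) * sizeof(unsigned long);
}
```

With these helpers, irq_bitmap_bytes(8192) is 1024 bytes, which matches
the patch's estimate of roughly 1kb per 8k interrupts, while the 24-bit
GICv3 interrupt space, irq_bitmap_bytes(1UL << 24), comes to 2097152
bytes, i.e. the 2MB mentioned earlier in the thread.
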
> Your suggestion is more of a long term solution vs. our short term
> solution.
Exactly. Experience shows that short term hacks are almost always a
bad idea and result in something that isn't maintainable.
> I'm not wedded to any solution, we just need to relieve
> the limitation so our hardware starts working. I would imagine other
> companies have this issue, but I don't know which ones currently.
This architecture has been in the wild for the best part of 10 years,
in Linux for 8 years, and nobody so far screamed because of this
perceived limitation. It would help if you described exactly what
breaks in your system, because just saying "give me more" is not
exactly helping (there are other limitations in the GICv3 ITS driver
that may bite you anyway).
> I would rather use an upstream solution versus holding the
> patches privately. I would suggest if this limitation would not be
> overcome for 3-4 releases the short term solution should be
> acceptable over that time frame to be replaced by something else
> after that.
If you want to have an impact on the features being merged in the
upstream kernel, a good start would be to take feedback on board.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.