Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Marc Zyngier
Date: Fri Apr 07 2023 - 05:18:28 EST
Hi Radu,
On Fri, 07 Apr 2023 00:56:40 +0100,
Radu Rendec <rrendec@xxxxxxxxxx> wrote:
>
> Hello Marc, Marek,
>
> On Tue, 2022-05-03 at 10:32 +0100, Marc Zyngier wrote:
> > On Mon, 02 May 2022 16:45:59 +0100,
> > Marek Behún <kabel@xxxxxxxxxx> wrote:
> > >
> > > On Mon, 02 May 2022 10:31:11 +0100
> > > Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > >
> > > > On Mon, 02 May 2022 09:21:37 +0100,
> > > > Marek Behún <kabel@xxxxxxxxxx> wrote:
> > > > >
> > > > > Dear Marc, Thomas,
> > > > >
> > > > > > we have encountered the following problem, which we hope you can
> > > > > > shed some light on: What is the intended way to set the affinity
> > > > > > (and possibly other irq attributes) of the parent IRQ of chained
> > > > > > IRQs when using the irqdomain API?
> > > >
> > > > Simples: you can't. What sense does it make to change the affinity of
> > > > the parent interrupt, given that its fate is tied to *all* of the
> > > > other interrupts that are muxed to it?
> > >
> > > Dear Marc,
> > >
> > > thank you for your answer. Still:
> > >
> > > What about when we want to set the same affinity for all the chained
> > > interrupts?
> > >
> > > Example: on Armada 385 there are 4 PCIe controllers. Each controller
> > > has one interrupt from which we trigger chained interrupts. We would
> > > > like to configure each controller to trigger its interrupt (and thus
> > > > all chained interrupts in the domain) on a different CPU core.
> > >
> > > > Moreover, we would really like to do this at runtime, through sysfs,
> > > depending on for example whether there are cards plugged in the PCIe
> > > ports.
> > >
> > > > Maybe there should be some mechanism to allow changing the affinity
> > > > of a whole irqdomain, or something?
> >
> > Should? Maybe. But not for an irqdomain (which really doesn't have
> > anything to do with interrupt affinity).
> >
> > > What you may want is a new sysfs interface that would allow a parent
> > > interrupt's affinity to be changed, while also exposing to userspace
> > > all the interrupts this affects *at the same time*. Something like:
> >
> > /sys/kernel/irq/42/smp_affinity_list
> > /sys/kernel/irq/42/muxed_irqs/
> > /sys/kernel/irq/42/muxed_irqs/56 -> ../../56
> > /sys/kernel/irq/42/muxed_irqs/57 -> ../../57
> >
> > The main issues are that:
> >
> > > - we don't really track the muxing information in any of the data
> > >   structures, so you can't just walk a short list and generate this
> > >   information. You'd need to build the topology information at
> > >   allocation time (or fish it out at runtime, but that's likely a
> > >   pain).
> > >
> > > - sysfs doesn't deal with affinities at all. procfs does, but adding
> > >   more crap there is frowned upon.
> > >
> > > - it *must* be a new interface. You can't repurpose the existing one,
> > >   as something like irqbalance would otherwise be massively
> > >   confused by seeing interrupts moving around behind its back.
> > >
> > > - conversely, you'll need to teach irqbalance how to deal with this
> > >   new interface.
> > >
> > > - this needs to be safe against CPU hotplug. It probably already is,
> > >   but nobody ever tested it, given that userspace can't interact with
> > >   these interrupts at the moment.
>
> Are you aware of any work being done (or having been done) in this
> area? Thanks in advance!
>
> My colleagues and I are looking into picking this up and implementing
> the new sysfs interface and the related irqbalance changes, and we are
> currently evaluating the level of effort. Obviously, we would like to
> avoid any effort duplication.

I don't think anyone ever tried it (it's far easier to just moan about
it than to do anything useful). But if you want to start looking into
that, that'd be great.

One of my concerns is that allowing affinity changes for chained
interrupts may uncover issues in existing drivers, so it would have to
be an explicit buy-in for any chained irqchip. That's probably not too
hard to achieve anyway, given that you'll need some new infrastructure
to track the muxed interrupts.

Hopefully this will result in something actually happening! ;-)

Cheers,
M.
--
Without deviation from the norm, progress is not possible.