Re: [PATCH] irqchip/gic-v3-its: Don't try to move a disabled irq

From: Saidi, Ali
Date: Tue Jun 02 2020 - 14:48:04 EST

On 5/31/20, 9:40 PM, "Herrenschmidt, Benjamin" <benh@xxxxxxxxxx> wrote:

On Sun, 2020-05-31 at 12:09 +0100, Marc Zyngier wrote:
> > Not great indeed. But this is not, as far as I can tell, a GIC
> > driver problem.
> >
> > The semantic of activate/deactivate (which maps to started/shutdown
> > in the IRQ code) is that the HW resources for a given interrupt are
> > only committed when the interrupt is activated. Trying to perform
> > actions involving the HW on an interrupt that isn't active cannot be
> > guaranteed to take effect.
> >
> > I'd rather address it in the core code, by preventing set_affinity (and
> > potentially others) from taking place when the interrupt is not in the
> > STARTED state. Userspace would get an error, which is perfectly
> > legitimate, and which it already has to deal with for plenty of
> > other reasons.

So I finally found time to dig a bit in there :) Code has changed a bit
since last I looked. But I have memories of the startup code messing
around with the affinity, and here it is. In irq_startup() :

	switch (__irq_startup_managed(desc, aff, force)) {
	case IRQ_STARTUP_NORMAL:
		ret = __irq_startup(desc);
		...
	case IRQ_STARTUP_MANAGED:
		irq_do_set_affinity(d, aff, false);
		ret = __irq_startup(desc);
		...
	case IRQ_STARTUP_ABORT:
		...
		return 0;

So we have two cases here. Normal and managed.

In the managed case, we set the affinity before startup. I feel like your
patch might break that or am I missing something ?

Additionally, your patch would break any userspace program that expects to
be able to change the affinity on an interrupt before it's been started.
I don't know if such a thing exists but the fact that we hit that bug
makes me think it might.

Now most controller drivers (at least those I'm familiar with, which doesn't
include GIC at this point) can deal with that just fine.

Now there's also another possible issue:

Your patch checks irqd_is_started(). Now I always mix up irqd vs irq_state these
days so I may be wrong but irq_state_set_started() is only done in __irq_startup
which will *not* be called if the interrupt has NOAUTOEN.

Is that ok ? Do we intend for affinity setting not to work until the first
enable_irq() for such an interrupt ? We could check activated instead of
started I suppose. (again provided I didn't mix up two different things
between the irqd and the irq_state stuff).
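
To make the started-vs-activated concern concrete, here is a minimal
userspace model. It is only a sketch: the struct and function names are
hypothetical and are not kernel symbols, and it assumes that requesting an
interrupt activates it while a NOAUTOEN interrupt is not started until its
first enable.

```c
#include <stdbool.h>

/* Hypothetical model of the two state bits discussed above. */
struct model_desc {
	bool activated;	/* assumed: set when the interrupt is requested */
	bool started;	/* set only by the startup path */
};

/* Requesting activates; startup is skipped when NOAUTOEN is set. */
static void model_request(struct model_desc *d, bool noautoen)
{
	d->activated = true;
	d->started = !noautoen;
}

/* The first enable of a NOAUTOEN interrupt is what starts it. */
static void model_enable(struct model_desc *d)
{
	d->started = true;
}

/* Policy A: only allow set_affinity once the interrupt is started. */
static bool affinity_allowed_if_started(const struct model_desc *d)
{
	return d->started;
}

/* Policy B: allow set_affinity as soon as the interrupt is activated. */
static bool affinity_allowed_if_activated(const struct model_desc *d)
{
	return d->activated;
}
```

In this model a NOAUTOEN interrupt fails policy A between request and the
first enable, but passes policy B for that whole window.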

For these reasons my gut feeling is we should just fix GIC as Ali wanted to
do initially.

The basic idea is simply to defer the HW configuration until the interrupt
has been started. I don't see why that would be an issue. Have set_affinity just
store the mask (and apply whatever other sanity checking it might want to do)
until the interrupt is started and when started, apply things to HW.
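
A minimal userspace sketch of that deferral, under stated assumptions: every
name here is hypothetical (none are GIC driver or core IRQ symbols), the CPU
mask is modelled as a plain 64-bit word, and "hardware" is just a struct
field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one interrupt; not a kernel structure. */
struct model_irq {
	uint64_t wanted_mask;	/* last mask accepted by set_affinity */
	uint64_t hw_mask;	/* what the simulated hardware routes to */
	bool started;
};

/* Sanity-check and store the mask; touch "hardware" only once started. */
static int model_set_affinity(struct model_irq *irq, uint64_t mask)
{
	if (mask == 0)
		return -1;	/* reject an empty mask */

	irq->wanted_mask = mask;
	if (irq->started)
		irq->hw_mask = mask;
	return 0;
}

/* Starting the interrupt applies whatever mask was stored earlier. */
static void model_startup(struct model_irq *irq)
{
	irq->started = true;
	if (irq->wanted_mask)
		irq->hw_mask = irq->wanted_mask;
}
```

With this shape, a set_affinity call before startup still succeeds from the
caller's point of view, and the stored mask reaches the hardware at startup.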

I might be missing a reason why it's more complicated than that :) But I do
feel a bit uncomfortable with your approach.

Looks like the x86 apic set_affinity call explicitly checks whether it's
activated in the managed case, which makes sense given the code Ben posted
above:
	/*
	 * Core code can call here for inactive interrupts. For inactive
	 * interrupts which use managed or reservation mode there is no
	 * point in going through the vector assignment right now as the
	 * activation will assign a vector which fits the destination
	 * cpumask. Let the core code store the destination mask and be
	 * done with it.
	 */
	if (!irqd_is_activated(irqd) &&
	    (apicd->is_managed || apicd->can_reserve))
		return IRQ_SET_MASK_OK;

My original patch should certainly check activated and not disabled. With
that, do you still have reservations, Marc?