Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus

From: Thomas Gleixner
Date: Tue Jul 04 2017 - 04:50:57 EST


On Mon, 3 Jul 2017, Chen Yu wrote:
> On Sun, Jun 04, 2017 at 10:04:53PM +0200, Thomas Gleixner wrote:
> > After looking at the callsites, it's safe to change
> > irq_set_affinity_locked() so that it uses the direct affinity setter
> > function when force == true.
> >
> Sorry it took me sometime to understand this point(this is why I did not reply
> to you at the first time :-)
> I thought the defination of the word 'safe' here means, we should
> not adjust the irq affinity in the process context if the ISR is
> still running, otherwise there might be a race condition.
>
> Currently, there are four drivers would set the force flag to true(AKA,
> invoking irq_force_affinity()).
>
> 1. exynos4_mct_starting_cpu()
> The irq affinity is set before the clockevent is registered,
> so there would be no interrupt triggered when adjusting
> the irq affinity in the process context. Safe.
>
> 2. sirfsoc_local_timer_starting_cpu()
> The same as above. Safe.
>
> 3. arm_perf_starting_cpu()
> During cpu offline, the pmu interrupt(non percpu pmu interrupt)
> might be migrated to other online cpus. Then once the same cpu
> is put online, the interrupt will be set back to this cpu again
> by invoking irq_force_affinity(), but currently the pmu interrupt
> might be still running on other cpus, so it would be unsafe to adjust
> its irq affinity in the process context?

No, that's not an issue. The ability to move interrupts in process context,
or better said in any context, has nothing to do with a concurrent
interrupt. The normal mechanics for most architectures/interrupt
controllers is just to program the new affinity setting which will take
effect with the next delivered interrupt.

We just have these ill designed hardware implementations which do not allow
that. They require to change the interrupt affinity at the point when the
interrupt is handled on the original target CPU. But that's hard to achieve
when the CPU is about to go offline, because we might wait forever for an
interrupt to be raised. So in that case we need to forcefully move them
away and take the risk of colliding with an actual interrupt being raised
in hardware concurrently which has the potential to confuse the interrupt
chip.

> 4. sunhv_migrate_hvcons_irq()
> The cpu who encountered a panic needs to migrate the hvcons irq to the
> current alive cpu, and send ipi to stop other cpus. So at the time to
> adjust the irq affinity for the hvcons, the interrupt of the latter might
> be running and it might be unsafe to adjust the irq affinity in the
> process context?

None of these are related to that problem. All of these architectures can
move interrupts in process context unconditionally. It's also not relevant
which callsites invoke irq_set_affinity_locked() with force=true.

The point is whether we can change the semantics of irq_set_affinity_locked()
without breaking something.

But please answer my other mail in that thread [1] first before we start
about changing anything in that area. The affinity related changes are in
Linus tree now, so please retest against that as well.

Thanks,

tglx

[1] http://lkml.kernel.org/r/alpine.DEB.2.20.1706282036330.1890@nanos