Re: commit 3d5bfbd97163 versus -rt

From: Thomas Gleixner
Date: Fri Jun 18 2021 - 16:04:11 EST


On Tue, Jun 15 2021 at 14:35, Rasmus Villemoes wrote:
> [<8016fb80>] (generic_handle_irq) from [<8043e98c>]
> (mpc8xxx_gpio_irq_cascade+0xac/0xd0)
> [<8043e8e0>] (mpc8xxx_gpio_irq_cascade) from [<80171c80>]
> (irq_forced_thread_fn+0x38/0x8c)
> r5:80e282c0 r4:80deda00
> [<80171c48>] (irq_forced_thread_fn) from [<80171eb4>]
> (irq_thread+0x11c/0x238)
.
> Reverting commit 3d5bfbd9716318b1ca5c38488aa69f64d38a9aa5 (gpio:
> mpc8xxx: change the gpio interrupt flags.) makes it go away, as does
> disabling CONFIG_PREEMPT_RT or simply booting a vanilla v5.10.42 (where
> that option exists but cannot be selected).
>
> This seems to be the kind of thing where an -rt expert can immediately
> see what's wrong and how to fix it. Ideas anyone?

Let me explain. The force threaded interrupt code in mainline made the
wrong assumption that disabling softirqs is sufficient to provide the
semantics of non-threaded handler invocation. This turned out to be
wrong and caused people to do fancy workarounds.

See 81e2073c175b ("genirq: Disable interrupts for force threaded
handlers") for details.

So people started to remove IRQF_NO_THREAD or local_irq_save/restore
pairs in drivers.

But 3d5bfbd97163 ("gpio: mpc8xxx: change the gpio interrupt flags.") has
nothing to do with that:

"Delete the interrupt IRQF_NO_THREAD flags in order to gpio interrupts
can be threaded to allow high-priority processes to preempt."

This changelog is blatantly wrong. In mainline forced irq threads have
always been invoked with softirqs disabled, which obviously makes them
non-preemptible. And on RT this would have exploded exactly in the way
you observed.

The commit predates 81e2073c175b and therefore was obviously never
tested neither on mainline nor on RT.

Bartosz, please revert this ASAP.

I'll add a lockdep assert into generic_handle_irq() to make it obvious
where the problem is. That won't help with completely untested garbage
of course.

81e2073c175b is obviously a problem for demultiplexing drivers which
make weird assumptions when RT comes into play and I'm not surprised
that there is some fallout which we need to look at. Especially when
people start to zap IRQF_NO_THREAD or irq_save/restore pairs blindly.

Thanks,

tglx