Re: [PATCH 1/2] arm64/entry: Fix involuntary preemption exception masking

From: Thomas Gleixner

Date: Fri Mar 20 2026 - 11:51:27 EST


On Fri, Mar 20 2026 at 14:57, Mark Rutland wrote:
> On Fri, Mar 20, 2026 at 03:11:20PM +0100, Thomas Gleixner wrote:
>> Yes. It's not an optimization. It's a correctness issue.
>>
>> If the interrupted context is RCU idle then you have to carefully go
>> back to that context. So that the context can tell RCU it is done with
>> the idle state and RCU has to pay attention again. Otherwise all of this
>> becomes imbalanced.
>>
>> This is about context-level nesting:
>>
>>   ...
>>   L1.A ct_cpuidle_enter();
>>
>>     -> interrupt
>>       L2.A ct_irq_enter();
>>       ... // Set NEED_RESCHED
>>       L2.B ct_irq_exit();
>>
>>   ...
>>   L1.B ct_cpuidle_exit();
>>
>> Scheduling between #L2.B and #L1.B makes RCU rightfully upset.
>
> I suspect I'm missing something obvious here:
>
> * Regardless of nesting, I see that scheduling between L2.B and L1.B is
>   broken because RCU isn't watching.
>
> * I'm not sure whether there's a problem with scheduling between L2.A
>   and L2.B, which is what arm64 used to do, and what arm64 would do
>   after this patch.

The only reason why it "works" is that the idle task has preemption
permanently disabled, so it won't really schedule even if need_resched()
is set. So it "works" by chance and not by design.
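
To make the "by chance" part concrete, here is a rough user-space model of
the exit path. It is only a sketch, not kernel code: struct cpu_model,
struct result, model_rcu_check() and model_exit_cond_resched() are made-up
names for illustration, and the RCU check is reduced to a single flag that
says whether the interrupt hit a context that was RCU idle.

```c
#include <stdbool.h>

/* Toy per-CPU state. Hypothetical model, not a kernel data structure. */
struct cpu_model {
	int  preempt_count;     /* non-zero for the idle task */
	bool need_resched;      /* set by the interrupt handler */
	bool irq_from_rcu_idle; /* interrupt arrived between L1.A and L1.B */
};

struct result {
	bool check_fired; /* the sanity check noticed the RCU-idle case */
	bool scheduled;   /* the exit path would have called the scheduler */
};

/*
 * Stand-in for the RCU sanity check (assumption: it complains whenever
 * the interrupted context was RCU idle).
 */
static bool model_rcu_check(const struct cpu_model *cpu)
{
	return cpu->irq_from_rcu_idle;
}

/*
 * check_first == false: the check sits behind the preempt_count test.
 * check_first == true:  the check runs unconditionally.
 */
struct result model_exit_cond_resched(const struct cpu_model *cpu,
				      bool check_first)
{
	struct result r = { false, false };

	if (check_first)
		r.check_fired = model_rcu_check(cpu);

	if (!cpu->preempt_count) {
		if (!check_first)
			r.check_fired = model_rcu_check(cpu);
		if (cpu->need_resched)
			r.scheduled = true;
	}
	return r;
}
```

For an idle-task-like state (preempt_count = 1, need_resched = true,
irq_from_rcu_idle = true) the model never schedules in either variant, but
only the check_first variant reports the problem. With the check behind the
preempt_count test, the broken call path stays invisible.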

Apply the patch below and watch the show.

> Thanks for all of this. Even if I'm confused right now, it's very
> helpful!

RCU-induced confusion is perfectly normal. Everyone suffers from that at
some point. Welcome to the club.

Thanks,

tglx
---
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -187,9 +187,10 @@ static inline bool arch_irqentry_exit_ne

 void raw_irqentry_exit_cond_resched(void)
 {
+	rcu_irq_exit_check_preempt();
+
 	if (!preempt_count()) {
 		/* Sanity check RCU and thread stack */
-		rcu_irq_exit_check_preempt();
 		if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
 			WARN_ON_ONCE(!on_thread_stack());
 		if (need_resched() && arch_irqentry_exit_need_resched())