[PATCH 05/10] entry: Split preemption from irqentry_exit_to_kernel_mode()

From: Mark Rutland

Date: Tue Apr 07 2026 - 09:19:51 EST


Some architecture-specific work needs to be performed between the state
management for exception entry/exit and the "real" work to handle the
exception. For example, arm64 needs to manipulate a number of exception
masking bits, with different exceptions requiring different masking.

Generally this can all be hidden in the architecture code, but for arm64
the current structure of irqentry_exit_to_kernel_mode() makes this
particularly difficult to handle in a way that is correct, maintainable,
and efficient.

The gory details are described in the thread surrounding:

https://lore.kernel.org/lkml/acPAzdtjK5w-rNqC@J2N7QTR9R3/

The summary is:

* Currently, irqentry_exit_to_kernel_mode() handles both involuntary
  preemption AND state management necessary for exception return.

* When scheduling (including involuntary preemption), arm64 needs to
  have all arm64-specific exceptions unmasked, though regular interrupts
  must be masked.

* Prior to the state management for exception return, arm64 needs to
  mask a number of arm64-specific exceptions, and perform some work with
  these exceptions masked (with RCU watching, etc).

While in theory it is possible to handle this with a new arch_*() hook
called somewhere under irqentry_exit_to_kernel_mode(), this is fragile
and complicated, and doesn't match the flow used for exception return to
user mode, which has a separate 'prepare' step (where preemption can
occur) prior to the state management.

To solve this, refactor irqentry_exit_to_kernel_mode() to match the
style of {irqentry,syscall}_exit_to_user_mode(), moving preemption logic
into a new irqentry_exit_to_kernel_mode_preempt() function, and moving
state management into a new irqentry_exit_to_kernel_mode_after_preempt()
function. The existing irqentry_exit_to_kernel_mode() is left as a
caller of both of these, avoiding the need to modify existing callers.
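
As a rough illustration of the ordering this split is meant to allow, the
following is a user-space model, not kernel code. The types and the
irqentry_*() bodies are simplified stand-ins, CONFIG_PREEMPTION is assumed
to be enabled, the instrumentation_begin()/instrumentation_end() markers
are omitted, and arch_mask_exceptions(), arch_exit_work() and
arch_exit_to_kernel_mode() are hypothetical names invented for this
sketch rather than real arm64 functions.

```c
/* User-space model only: stand-ins for kernel types and helpers. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct pt_regs { bool irqs_disabled; };
typedef struct { bool exit_rcu; } irqentry_state_t;

/* One character per step, so the ordering can be inspected. */
static char call_log[16];

static void log_step(char c)
{
	size_t n = strlen(call_log);

	if (n + 1 < sizeof(call_log)) {
		call_log[n] = c;
		call_log[n + 1] = '\0';
	}
}

static bool regs_irqs_disabled(const struct pt_regs *regs)
{
	return regs->irqs_disabled;
}

/* Stub: the real function may call into the scheduler. */
static void irqentry_exit_cond_resched(void) { log_step('R'); }

/* Mirrors the new preempt helper, with CONFIG_PREEMPTION assumed on. */
static void irqentry_exit_to_kernel_mode_preempt(struct pt_regs *regs,
						 irqentry_state_t state)
{
	if (regs_irqs_disabled(regs) || state.exit_rcu)
		return;

	irqentry_exit_cond_resched();
}

/* Stub for the RCU/lockdep/tracing state management. */
static void irqentry_exit_to_kernel_mode_after_preempt(struct pt_regs *regs,
							irqentry_state_t state)
{
	(void)regs;
	(void)state;
	log_step('S');
}

/* Hypothetical arch hooks; not real arm64 functions. */
static void arch_mask_exceptions(void) { log_step('M'); }
static void arch_exit_work(void) { log_step('W'); }

/* Hypothetical arch-side exit path; returns the recorded step order. */
const char *arch_exit_to_kernel_mode(struct pt_regs *regs,
				     irqentry_state_t state)
{
	call_log[0] = '\0';

	irqentry_exit_to_kernel_mode_preempt(regs, state);
	arch_mask_exceptions();
	arch_exit_work();
	irqentry_exit_to_kernel_mode_after_preempt(regs, state);

	return call_log;
}
```

In this model, when the interrupted context had interrupts enabled and
exit_rcu is clear, the recorded order is R, M, W, S: preemption is
attempted before the arch masking step, and state management runs last.
When interrupts were disabled or exit_rcu is set, the R step is skipped.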

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxx>
Cc: Vladimir Murzin <vladimir.murzin@xxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
---
include/linux/irq-entry-common.h | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)

Thomas/Peter/Andy, as mentioned on IRC, I haven't created kerneldoc
comments for these new functions because the existing comments don't
seem all that consistent (e.g. for user mode vs kernel mode), and I
suspect we want to rewrite them all in one go for wider consistency.

I'm happy to respin this, or to follow up with that as per your
preference.

Mark.

diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
index 2206150e526d8..24830baa539c6 100644
--- a/include/linux/irq-entry-common.h
+++ b/include/linux/irq-entry-common.h
@@ -421,10 +421,18 @@ static __always_inline irqentry_state_t irqentry_enter_from_kernel_mode(struct p
 	return ret;
 }

-static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, irqentry_state_t state)
+static inline void irqentry_exit_to_kernel_mode_preempt(struct pt_regs *regs, irqentry_state_t state)
 {
-	lockdep_assert_irqs_disabled();
+	if (regs_irqs_disabled(regs) || state.exit_rcu)
+		return;
+
+	if (IS_ENABLED(CONFIG_PREEMPTION))
+		irqentry_exit_cond_resched();
+}

+static __always_inline void
+irqentry_exit_to_kernel_mode_after_preempt(struct pt_regs *regs, irqentry_state_t state)
+{
 	if (!regs_irqs_disabled(regs)) {
 		/*
 		 * If RCU was not watching on entry this needs to be done
@@ -443,9 +451,6 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, i
 		}

 		instrumentation_begin();
-		if (IS_ENABLED(CONFIG_PREEMPTION))
-			irqentry_exit_cond_resched();
-
 		/* Covers both tracing and lockdep */
 		trace_hardirqs_on();
 		instrumentation_end();
@@ -459,6 +464,17 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, i
 	}
 }

+static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, irqentry_state_t state)
+{
+	lockdep_assert_irqs_disabled();
+
+	instrumentation_begin();
+	irqentry_exit_to_kernel_mode_preempt(regs, state);
+	instrumentation_end();
+
+	irqentry_exit_to_kernel_mode_after_preempt(regs, state);
+}
+
 /**
  * irqentry_enter - Handle state tracking on ordinary interrupt entries
  * @regs:	Pointer to pt_regs of interrupted context
--
2.30.2