Re: [PATCH 05/10] entry: Split preemption from irqentry_exit_to_kernel_mode()

From: Jinjie Ruan

Date: Tue Apr 07 2026 - 21:42:06 EST




On 2026/4/7 21:16, Mark Rutland wrote:
> Some architecture-specific work needs to be performed between the state
> management for exception entry/exit and the "real" work to handle the
> exception. For example, arm64 needs to manipulate a number of exception
> masking bits, with different exceptions requiring different masking.
>
> Generally this can all be hidden in the architecture code, but for arm64
> the current structure of irqentry_exit_to_kernel_mode() makes this
> particularly difficult to handle in a way that is correct, maintainable,
> and efficient.
>
> The gory details are described in the thread surrounding:
>
> https://lore.kernel.org/lkml/acPAzdtjK5w-rNqC@J2N7QTR9R3/
>
> The summary is:
>
> * Currently, irqentry_exit_to_kernel_mode() handles both involuntary
> preemption AND state management necessary for exception return.
>
> * When scheduling (including involuntary preemption), arm64 needs to
> have all arm64-specific exceptions unmasked, though regular interrupts
> must be masked.
>
> * Prior to the state management for exception return, arm64 needs to
> mask a number of arm64-specific exceptions, and perform some work with
> these exceptions masked (with RCU watching, etc).
>
> While in theory it is possible to handle this with a new arch_*() hook
> called somewhere under irqentry_exit_to_kernel_mode(), this is fragile
> and complicated, and doesn't match the flow used for exception return to
> user mode, which has a separate 'prepare' step (where preemption can
> occur) prior to the state management.
>
> To solve this, refactor irqentry_exit_to_kernel_mode() to match the
> style of {irqentry,syscall}_exit_to_user_mode(), moving preemption logic
> into a new irqentry_exit_to_kernel_mode_preempt() function, and moving
> state management into a new irqentry_exit_to_kernel_mode_after_preempt()
> function. The existing irqentry_exit_to_kernel_mode() is left as a
> caller of both of these, avoiding the need to modify existing callers.
>
> There should be no functional change as a result of this patch.
>
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxx>
> Cc: Vladimir Murzin <vladimir.murzin@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> ---
> include/linux/irq-entry-common.h | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> Thomas/Peter/Andy, as mentioned on IRC, I haven't created kerneldoc
> comments for these new functions because the existing comments don't
> seem all that consistent (e.g. for user mode vs kernel mode), and I
> suspect we want to rewrite them all in one go for wider consistency.
>
> I'm happy to respin this, or to follow up with that, as per your
> preference.
>
> Mark.
>
> diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
> index 2206150e526d8..24830baa539c6 100644
> --- a/include/linux/irq-entry-common.h
> +++ b/include/linux/irq-entry-common.h
> @@ -421,10 +421,18 @@ static __always_inline irqentry_state_t irqentry_enter_from_kernel_mode(struct p
> return ret;
> }
>
> -static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, irqentry_state_t state)
> +static inline void irqentry_exit_to_kernel_mode_preempt(struct pt_regs *regs, irqentry_state_t state)
> {
> - lockdep_assert_irqs_disabled();
> + if (regs_irqs_disabled(regs) || state.exit_rcu)
> + return;
> +
> + if (IS_ENABLED(CONFIG_PREEMPTION))
> + irqentry_exit_cond_resched();
> +}
>
> +static __always_inline void
> +irqentry_exit_to_kernel_mode_after_preempt(struct pt_regs *regs, irqentry_state_t state)
> +{
> if (!regs_irqs_disabled(regs)) {
> /*
> * If RCU was not watching on entry this needs to be done
> @@ -443,9 +451,6 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, i
> }
>
> instrumentation_begin();
> - if (IS_ENABLED(CONFIG_PREEMPTION))
> - irqentry_exit_cond_resched();
> -
> /* Covers both tracing and lockdep */
> trace_hardirqs_on();
> instrumentation_end();
> @@ -459,6 +464,17 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, i
> }
> }
>
> +static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, irqentry_state_t state)
> +{
> + lockdep_assert_irqs_disabled();
> +
> + instrumentation_begin();
> + irqentry_exit_to_kernel_mode_preempt(regs, state);
> + instrumentation_end();
> +
> + irqentry_exit_to_kernel_mode_after_preempt(regs, state);
> +}

Reviewed-by: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>

> +
> /**
> * irqentry_enter - Handle state tracking on ordinary interrupt entries
> * @regs: Pointer to pt_regs of interrupted context