Re: [PATCH v3 1/1] kernel: kprobes: fix cur_kprobe corruption during re-entrant kprobe_busy_begin() calls
From: Mark Rutland
Date: Mon Mar 02 2026 - 08:45:45 EST
On Mon, Mar 02, 2026 at 04:23:47PM +0530, Khaja Hussain Shaik Khaji wrote:
> Fix current_kprobe corruption that occurs when kprobe_busy_begin() is
> called re-entrantly while a kprobe handler is active.
>
> kprobe_busy_begin() unconditionally overwrites current_kprobe with
> &kprobe_busy, and kprobe_busy_end() unconditionally writes NULL back.
> This is correct when no kprobe is active, but loses the outer state on
> re-entrant calls.
The structure of kprobe_busy_begin() and kprobe_busy_end() implies that
re-entrancy is unexpected, and something that should be avoided somehow.
Is that the case, or are kprobe_busy_begin() and kprobe_busy_end()
generally buggy?
> On arm64, arm64_enter_el1_dbg() re-enables IRQs before invoking kprobe
> handlers.
No, arm64_enter_el1_dbg() does not re-enable IRQs. It only manages state
tracking.
I don't know if you meant to say a different function here, but this
statement is clearly wrong.
> This allows an IRQ taken during a kretprobe entry_handler to trigger
> kprobe_flush_task() via softirq, which calls kprobe_busy_begin/end and
> corrupts current_kprobe.
This would be easier to follow if the backtrace were included in the
commit message, rather than in the cover letter, such that it could be
referred to easily.
> Problem flow: kretprobe entry_handler -> IRQ -> softirq ->
> kprobe_flush_task() -> kprobe_busy_begin/end -> current_kprobe corruption.
We shouldn't take the IRQ in the first place here. AFAICT, nothing
unmasks IRQs prior to the entry handler.
That suggests that something is going wrong *within* your entry handler
that causes IRQs to be unmasked unexpectedly.
Please can we find out *exactly* where IRQs get unmasked for the first
time?
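One blunt (untested) way to bisect that would be to scatter checks
through the entry handler, e.g. (assuming a standard kretprobe
entry_handler; irqs_disabled() and WARN_ONCE() are the mainline
helpers):

```c
/* Untested sketch: find where IRQs first become unmasked by
 * checking at each step of the existing entry handler. */
static int entry_handler(struct kretprobe_instance *ri,
			 struct pt_regs *regs)
{
	WARN_ONCE(!irqs_disabled(), "IRQs on at handler entry\n");

	/* ... first part of the existing handler body ... */
	WARN_ONCE(!irqs_disabled(), "IRQs on after first step\n");

	/* ... rest of the handler body ... */
	WARN_ONCE(!irqs_disabled(), "IRQs on at handler exit\n");
	return 0;
}
```

The first warning that fires tells us which step is responsible.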
Mark.
>
> This corruption causes two issues:
> 1. A NULL current_kprobe in setup_singlestep(), leading to a panic in
>    the single-step handler
> 2. kprobe_status being overwritten with KPROBE_HIT_ACTIVE during the
>    execute-out-of-line window
>
> Implement a per-CPU re-entrancy tracking mechanism:
> - A depth counter tracks nesting and is adjusted on every call
> - current_kprobe and kprobe_status are saved on the first entry
> - The saved state is restored on the final exit; nested calls only
>   adjust the depth counter
>
> This approach maintains compatibility with existing callers as
> save/restore of NULL is a no-op.
>
> Signed-off-by: Khaja Hussain Shaik Khaji <khaja.khaji@xxxxxxxxxxxxxxxx>
> ---
> kernel/kprobes.c | 34 ++++++++++++++++++++++++++++++----
> 1 file changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index e2cd01cf5968..47a4ae50ee6c 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -70,6 +70,15 @@ static bool kprobes_all_disarmed;
> static DEFINE_MUTEX(kprobe_mutex);
> static DEFINE_PER_CPU(struct kprobe *, kprobe_instance);
>
> +/* Per-CPU re-entrancy state for kprobe_busy_begin/end.
> + * kprobe_busy_begin() may be called while a kprobe handler
> + * is active - e.g. kprobe_flush_task() via softirq during
> + * kretprobe entry_handler on arm64 where IRQs are re-enabled.
> + */
> +static DEFINE_PER_CPU(int, kprobe_busy_depth);
> +static DEFINE_PER_CPU(struct kprobe *, kprobe_busy_saved_current);
> +static DEFINE_PER_CPU(unsigned long, kprobe_busy_saved_status);
> +
> kprobe_opcode_t * __weak kprobe_lookup_name(const char *name,
> unsigned int __unused)
> {
> @@ -1307,14 +1316,31 @@ void kprobe_busy_begin(void)
> struct kprobe_ctlblk *kcb;
>
> preempt_disable();
> - __this_cpu_write(current_kprobe, &kprobe_busy);
> - kcb = get_kprobe_ctlblk();
> - kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> + if (__this_cpu_read(kprobe_busy_depth) == 0) {
> + kcb = get_kprobe_ctlblk();
> + __this_cpu_write(kprobe_busy_saved_current,
> + __this_cpu_read(current_kprobe));
> + __this_cpu_write(kprobe_busy_saved_status,
> + kcb->kprobe_status);
> + __this_cpu_write(current_kprobe, &kprobe_busy);
> + kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> + }
> + __this_cpu_inc(kprobe_busy_depth);
> }
>
> void kprobe_busy_end(void)
> {
> - __this_cpu_write(current_kprobe, NULL);
> + struct kprobe_ctlblk *kcb;
> +
> + __this_cpu_dec(kprobe_busy_depth);
> +
> + if (__this_cpu_read(kprobe_busy_depth) == 0) {
> + kcb = get_kprobe_ctlblk();
> + __this_cpu_write(current_kprobe,
> + __this_cpu_read(kprobe_busy_saved_current));
> + kcb->kprobe_status =
> + __this_cpu_read(kprobe_busy_saved_status);
> + }
> preempt_enable();
> }
>
> --
> 2.34.1
>