[PATCH v3 0/1] kernel: kprobes: fix cur_kprobe corruption during re-entrant kprobe_busy_begin() calls

From: Khaja Hussain Shaik Khaji

Date: Mon Mar 02 2026 - 05:58:56 EST


This patch fixes a kprobes failure on arm64 in which current_kprobe is
lost during kretprobe entry handling under interrupt load.

v1 attempted to address this by simulating BTI instructions as NOPs, and
v2 by disabling preemption across the out-of-line (XOL) execution
window. Further analysis showed that the preemption hypothesis was
incorrect: the failure is not caused by scheduling or preemption during
XOL.

The actual root cause is re-entrant invocation of kprobe_busy_begin()
from an active kprobe context. On arm64, IRQs are re-enabled before
kprobe handlers are invoked, so an interrupt taken during a kretprobe
entry_handler can trigger kprobe_flush_task(), which calls
kprobe_busy_begin()/kprobe_busy_end() and corrupts current_kprobe and
kprobe_status.

[ 2280.630526] Call trace:
[ 2280.633044] dump_backtrace+0x104/0x14c
[ 2280.636985] show_stack+0x20/0x30
[ 2280.640390] dump_stack_lvl+0x58/0x74
[ 2280.644154] dump_stack+0x20/0x30
[ 2280.647562] kprobe_busy_begin+0xec/0xf0
[ 2280.651593] kprobe_flush_task+0x2c/0x60
[ 2280.655624] delayed_put_task_struct+0x2c/0x124
[ 2280.660282] rcu_core+0x56c/0x984
[ 2280.663695] rcu_core_si+0x18/0x28
[ 2280.667189] handle_softirqs+0x160/0x30c
[ 2280.671220] __do_softirq+0x1c/0x2c
[ 2280.674807] ____do_softirq+0x18/0x28
[ 2280.678569] call_on_irq_stack+0x48/0x88
[ 2280.682599] do_softirq_own_stack+0x24/0x34
[ 2280.686900] irq_exit_rcu+0x5c/0xbc
[ 2280.690489] el1_interrupt+0x40/0x60
[ 2280.694167] el1h_64_irq_handler+0x20/0x30
[ 2280.698372] el1h_64_irq+0x64/0x68
[ 2280.701872] _raw_spin_unlock_irq+0x14/0x54
[ 2280.706173] dwc3_msm_notify_event+0x6e8/0xbe8
[ 2280.710743] entry_dwc3_gadget_pullup+0x3c/0x6c
[ 2280.715393] pre_handler_kretprobe+0x1cc/0x304
[ 2280.719956] kprobe_breakpoint_handler+0x1b0/0x388
[ 2280.724878] brk_handler+0x8c/0x128
[ 2280.728464] do_debug_exception+0x94/0x120
[ 2280.732670] el1_dbg+0x60/0x7c
[ 2280.735815] el1h_64_sync_handler+0x48/0xb8
[ 2280.740114] el1h_64_sync+0x64/0x68
[ 2280.743701] dwc3_gadget_pullup+0x0/0x124
[ 2280.747827] soft_connect_store+0xb4/0x15c
[ 2280.752031] dev_attr_store+0x20/0x38
[ 2280.755798] sysfs_kf_write+0x44/0x5c
[ 2280.759564] kernfs_fop_write_iter+0xf4/0x198
[ 2280.764033] vfs_write+0x1d0/0x2b0
[ 2280.767529] ksys_write+0x80/0xf0
[ 2280.770940] __arm64_sys_write+0x24/0x34
[ 2280.774974] invoke_syscall+0x54/0x118
[ 2280.778822] el0_svc_common+0xb4/0xe8
[ 2280.782587] do_el0_svc+0x24/0x34
[ 2280.785999] el0_svc+0x40/0xa4
[ 2280.789140] el0t_64_sync_handler+0x8c/0x108
[ 2280.793526] el0t_64_sync+0x198/0x19c
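
The sequence in the trace above can be modeled in userspace roughly as
follows. This is only a sketch under stated assumptions: every model_*
name is hypothetical, a single global stands in for the per-CPU
current_kprobe, kprobe_status and preemption are not modeled, and the
begin/end bodies assume that begin overwrites current_kprobe and end
clears it unconditionally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kprobe. */
struct model_kprobe { const char *name; };

static struct model_kprobe model_busy = { "busy" };
static struct model_kprobe *model_current;	/* models current_kprobe */

/* Assumed unpatched behaviour: overwrite the active probe. */
static void model_busy_begin(void)
{
	model_current = &model_busy;
}

/* Assumed unpatched behaviour: clear the active probe unconditionally. */
static void model_busy_end(void)
{
	model_current = NULL;
}

/* Models kprobe_flush_task() running from a softirq on the same CPU. */
static void model_flush_task(void)
{
	model_busy_begin();
	model_busy_end();
}

/*
 * Models a kretprobe entry handler for probe p that is interrupted
 * while p is the active probe. Returns 1 if p was lost, 0 otherwise.
 */
static int model_entry_handler_interrupted(struct model_kprobe *p)
{
	assert(p != NULL);
	model_current = p;	/* breakpoint handler made p active */
	model_flush_task();	/* IRQ -> softirq -> flush on this CPU */
	return model_current != p;
}
```

Under these assumptions model_entry_handler_interrupted() reports the
probe as lost, because the nested begin/end pair leaves model_current
NULL when control returns to the handler.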

This v3 patch makes kprobe_busy_begin()/kprobe_busy_end() safe against
re-entry by preserving the active kprobe state using a per-CPU depth
counter and saved state. The detailed failure analysis and justification
are included in the commit message.
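
For illustration, a depth counter plus saved state can be sketched in
userspace as below. This is not the code from the patch: all sketch_*
names are hypothetical, one global struct stands in for the per-CPU
data, preemption handling is omitted, and SKETCH_HIT_SS is just an
arbitrary non-idle status value.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_status { SKETCH_IDLE = 0, SKETCH_HIT_ACTIVE, SKETCH_HIT_SS };

struct sketch_kprobe { const char *name; };

/* One instance per CPU in a real implementation; one global here. */
struct sketch_cpu_state {
	struct sketch_kprobe *current;		/* models current_kprobe */
	enum sketch_status status;		/* models kprobe_status */
	unsigned int busy_depth;		/* begin/end nesting depth */
	struct sketch_kprobe *saved_current;	/* state seen by outermost begin */
	enum sketch_status saved_status;
};

static struct sketch_kprobe sketch_busy = { "busy" };
static struct sketch_cpu_state cpu0;

static void sketch_busy_begin(struct sketch_cpu_state *s)
{
	if (s->busy_depth++ == 0) {
		/* Outermost call: remember whatever was active. */
		s->saved_current = s->current;
		s->saved_status = s->status;
	}
	s->current = &sketch_busy;
	s->status = SKETCH_HIT_ACTIVE;
}

static void sketch_busy_end(struct sketch_cpu_state *s)
{
	assert(s->busy_depth > 0);	/* end must pair with a begin */
	if (--s->busy_depth == 0) {
		/* Outermost call: restore the interrupted state. */
		s->current = s->saved_current;
		s->status = s->saved_status;
	}
}
```

With this shape, a begin/end pair issued while a probe is active leaves
current and status as they were before the pair, and nested pairs only
restore the state when the outermost end runs.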

Changes since v2:
- Dropped the scheduling/preemption-based approach.
- Identified the re-entrant kprobe_busy_begin() root cause.
- Fixed kprobe_busy_begin/end to preserve active kprobe state.
- Link to v2: https://lore.kernel.org/all/20260217133855.3142192-2-khaja.khaji@xxxxxxxxxxxxxxxx/

Khaja Hussain Shaik Khaji (1):
  kernel: kprobes: fix cur_kprobe corruption during re-entrant
    kprobe_busy_begin() calls

 kernel/kprobes.c | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

--
2.34.1