Re: [PATCH sched_ext/for-7.0-fixes] sched_ext: Drop rq lock before calling ops.exit_task()
From: Tejun Heo
Date: Fri Mar 13 2026 - 15:05:10 EST
Hello,
On Thu, Mar 12, 2026 at 12:14:41AM +0100, Andrea Righi wrote:
> sched_ext_dead() calls scx_exit_task() while holding the rq lock, which
> invokes ops.exit_task(). If the BPF program calls helpers that acquire
> non-raw locks (e.g., bpf_task_storage_delete()), this can trigger the
> following BUG:
>
> =============================
> [ BUG: Invalid wait context ]
> 7.0.0-rc1-virtme #1 Not tainted
> -----------------------------
> (udev-worker)/115 is trying to lock:
> ffffffffa6970dd0 (rcu_tasks_trace_srcu_struct_srcu_usage.lock){....}-{3:3}, at: spin_lock_irqsave_ssp_contention+0x54/0x90
> other info that might help us debug this:
> context-{5:5}
> 3 locks held by (udev-worker)/115:
> #0: ffff8e16c634ce58 (&p->pi_lock){-.-.}-{2:2}, at: _task_rq_lock+0x2c/0x100
> #1: ffff8e16fbdbdae0 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x24/0xb0
> #2: ffffffffa6971b60 (rcu_read_lock){....}-{1:3}, at: __bpf_prog_enter+0x64/0x110
> stack backtrace:
> ...
> Sched_ext: cosmos_1.0.7_g780e898fc_dirty_x64_unknown_linux_gnu (enabled+all), task: runnable_at=-2ms
> Call Trace:
> <TASK>
> __lock_acquire+0xf86/0x1de0
> lock_acquire+0xcf/0x310
> _raw_spin_lock_irqsave+0x39/0x60
> spin_lock_irqsave_ssp_contention+0x54/0x90
> srcu_gp_start_if_needed+0x2a7/0x490
> bpf_selem_unlink+0x24b/0x590
> bpf_task_storage_delete+0x3a/0x90
> bpf_prog_3b623b4be76cfb86_scx_pmu_task_fini+0x26/0x2a
> bpf_prog_4b1530d9d9852432_cosmos_exit_task+0x1d/0x1f
> bpf__sched_ext_ops_exit_task+0x4b/0xa7
I think the better way to handle this is to make sure that the BPF operations
we may need are safe to call while holding the rq lock. After all, we need to
be able to use them while holding the rq lock. It doesn't make a lot of sense
for the exit path to disallow that.
Thanks.
--
tejun