[PATCH sched_ext/for-7.1] sched_ext: Reduce DSQ lock contention in consume_dispatch_q()
From: Andrea Righi
Date: Sat Mar 14 2026 - 19:53:00 EST
Replace raw_spin_lock() with raw_spin_trylock() when taking the DSQ lock
in consume_dispatch_q(). If the lock is contended, kick the current CPU
to retry on the next balance instead of spinning.
Under high load, multiple CPUs can contend on the same DSQ lock. With
raw_spin_lock(), waiters spin on the same cache line, wasting cycles and
increasing cache coherency traffic, which can slow down the lock holder.
With trylock, waiters back off and retry later, so the holder can complete
faster and the backing-off CPUs have a chance to consume other DSQs or run
tasks.
When in bypass mode, scx_kick_cpu() is suppressed, so fall back to
raw_spin_lock() to guarantee forward progress.
Since this slightly changes the behavior of scx_bpf_dsq_move_to_local(),
update the documentation to clarify that a false return value means no
eligible task could be consumed from the DSQ. This covers both the case
of an empty DSQ and any other condition that prevented task consumption.
Benchmarks that generate many enqueue/dispatch events (e.g., schbench)
show around a 2-3x throughput improvement with most of the scx schedulers
when this change is applied.
Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
---
kernel/sched/ext.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 9202c6d7a7713..8f48472f70f18 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2451,6 +2451,7 @@ static bool consume_dispatch_q(struct scx_sched *sch, struct rq *rq,
struct scx_dispatch_q *dsq, u64 enq_flags)
{
struct task_struct *p;
+ s32 cpu = cpu_of(rq);
retry:
/*
* The caller can't expect to successfully consume a task if the task's
@@ -2460,7 +2461,19 @@ static bool consume_dispatch_q(struct scx_sched *sch, struct rq *rq,
if (list_empty(&dsq->list))
return false;
- raw_spin_lock(&dsq->lock);
+ /*
+ * Use trylock to avoid spinning on a contended DSQ; if we fail to
+ * acquire the lock, kick the CPU to retry on the next balance.
+ *
+ * In bypass mode, simply spin to acquire the lock, since
+ * scx_kick_cpu() is suppressed.
+ */
+ if (scx_bypassing(sch, cpu)) {
+ raw_spin_lock(&dsq->lock);
+ } else if (!raw_spin_trylock(&dsq->lock)) {
+ scx_kick_cpu(sch, cpu, 0);
+ return false;
+ }
nldsq_for_each_task(p, dsq) {
struct rq *task_rq = task_rq(p);
@@ -8185,8 +8198,8 @@ __bpf_kfunc void scx_bpf_dispatch_cancel(const struct bpf_prog_aux *aux)
* before trying to move from the specified DSQ. It may also grab rq locks and
* thus can't be called under any BPF locks.
*
- * Returns %true if a task has been moved, %false if there isn't any task to
- * move.
+ * Returns %true if a task has been moved, %false if no eligible task could
+ * be consumed from @dsq_id.
*/
__bpf_kfunc bool scx_bpf_dsq_move_to_local___v2(u64 dsq_id, u64 enq_flags,
const struct bpf_prog_aux *aux)
--
2.53.0