Re: [PATCH 1/2] sched/core: Allow newidle for core-sched

From: K Prateek Nayak

Date: Wed Jun 24 2026 - 19:56:26 EST


Hello Peter,

On 6/24/2026 5:43 PM, Peter Zijlstra wrote:
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9942,9 +9942,6 @@ struct task_struct *pick_task_fair(struc
> return p;
>
> idle:
> - if (sched_core_enabled(rq))
> - return NULL;
> -
> new_tasks = sched_balance_newidle(rq, rf);

I have a sneaking feeling this might still race with a concurrent
sched_setaffinity(), like so:

    CPU0 (CPU1 is its sibling)                    CPU2 (CPU1 is idle in the meanwhile)
    ==========================                    ====================================

  rq0->core_pick = p;

  /* Core pick on SMT - CPU1 */
  pick_next_task(rq1)                             __sched_setaffinity(p)
    pick_next_task_fair()                           __set_cpus_allowed_ptr() /* p cannot run on CPU0 anymore */
      sched_balance_newidle()                         task_rq_lock(p, &rf)
        raw_spin_rq_unlock()                          ...
        ... /* continues newidle */
                                                      /* Gets lock */
                                                      __set_cpus_allowed_ptr_locked()
                                                        dest_cpu = 1; /* SMT of CPU0 */
                                                        affine_move_task(rq0, p, 1 /* Move to CPU1 */)
                                                          /*
                                                           * p is not on_cpu
                                                           * p is TASK_ON_RQ_QUEUED
                                                           * p is not migration disabled
                                                           */
                                                          move_queued_task(rq0, p, 1)
                                                          /* Moves task to CPU1 */
                                                      task_rq_unlock();

        ... /* Sees new task on CPU1 */
        raw_spin_rq_lock(rq1);
        pulled_task = 1;

    /* Retry pick; Finds p on CPU1 */
    return p;

  rq1->core_pick = p; !!! Two CPUs have the same core pick. !!!


Do we have a guard against this that I might be missing?

Should we treat core_pick similarly to task_on_cpu() and go down the
stopper route, like:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2f4530eb543f..e7f64f34aa4f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3049,7 +3049,9 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		return -EINVAL;
 	}

-	if (task_on_cpu(rq, p) || READ_ONCE(p->__state) == TASK_WAKING) {
+	if (task_on_cpu(rq, p) ||
+	    task_on_core(rq, p) ||
+	    READ_ONCE(p->__state) == TASK_WAKING) {
 		/*
 		 * MIGRATE_ENABLE gets here because 'p == current', but for
 		 * anything else we cannot do is_migration_disabled(), punt
---
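
To make the intent of the check above concrete, here is a minimal
userspace sketch of the decision it encodes. This is not kernel code:
task_on_core() is only a proposed helper, and every model_* name, the
struct layouts, and the two-sibling core width are assumptions made up
for illustration. In particular, the model takes the whole core as an
argument and scans each sibling's core_pick, which may not be how a real
task_on_core(rq, p) would be written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SMT_WIDTH 2	/* assumption: two SMT siblings per core */

struct model_task {
	int pid;
};

struct model_rq {
	struct model_task *curr;	/* task currently running on this CPU */
	struct model_task *core_pick;	/* task selected by the last core-wide pick */
};

struct model_core {
	struct model_rq rq[MODEL_SMT_WIDTH];
};

/* Models task_on_cpu(): p is currently running on this runqueue's CPU. */
static bool model_task_on_cpu(const struct model_rq *rq,
			      const struct model_task *p)
{
	return rq->curr == p;
}

/* Hypothetical task_on_core(): p is the pending core pick of any sibling. */
static bool model_task_on_core(const struct model_core *core,
			       const struct model_task *p)
{
	for (size_t i = 0; i < MODEL_SMT_WIDTH; i++) {
		if (core->rq[i].core_pick == p)
			return true;
	}
	return false;
}

/*
 * Models the condition in affine_move_task(): returns true when the move
 * should be punted to the stopper, false when the queued task could be
 * moved directly.
 */
static bool model_needs_stopper(const struct model_core *core,
				const struct model_rq *rq,
				const struct model_task *p,
				bool waking)
{
	assert(core && rq && p);

	return model_task_on_cpu(rq, p) ||
	       model_task_on_core(core, p) ||
	       waking;
}
```

In the scenario above, rq0->core_pick == p at the time CPU2 reaches
affine_move_task(), so the model takes the stopper path instead of
moving p to CPU1 directly.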

... or, when core scheduling is enabled, we can also return RETRY_TASK
after a successful newidle balance to force a core-wide pick, like:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d212bf04885..e393aed58bfa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9946,8 +9946,16 @@ struct task_struct *pick_task_fair(struct rq *rq, struct rq_flags *rf)
 	if (rq_modified_above(rq, &fair_sched_class))
 		return RETRY_TASK;

-	if (cfs_rq->nr_queued)
+	if (cfs_rq->nr_queued) {
+		/*
+		 * Force a core-wide pick if newidle
+		 * balance managed to pull a task since
+		 * the lock was dropped.
+		 */
+		if (sched_core_enabled(rq))
+			return RETRY_TASK;
 		goto again;
+	}

 	return NULL;
 }
---

Thoughts?

> if (new_tasks < 0)
> return RETRY_TASK;

--
Thanks and Regards,
Prateek