Re: [PATCH 15/19] sched/fair: Respect LLC preference in task migration and detach
From: Chen, Yu C
Date: Wed Oct 29 2025 - 10:24:03 EST
On 10/29/2025 11:54 AM, K Prateek Nayak wrote:
[snip]
@@ -10227,6 +10233,20 @@ static int detach_tasks(struct lb_env *env)
 		if (env->imbalance <= 0)
 			break;
+#ifdef CONFIG_SCHED_CACHE
+		/*
+		 * Don't detach more tasks if the remaining tasks want
+		 * to stay. We know the remaining tasks all prefer the
+		 * current LLC, because after order_tasks_by_llc(), the
+		 * tasks that prefer the current LLC are at the tail of
+		 * the list. The inhibition of detachment is to avoid too
+		 * many tasks being migrated out of the preferred LLC.
+		 */
+		if (sched_cache_enabled() && detached && p->preferred_llc != -1 &&
+		    llc_id(env->src_cpu) == p->preferred_llc)
+			break;
In all cases? Should we check can_migrate_llc() with respect to the util
already migrated and then decide whether to move the preferred-LLC tasks?
Before this "stop detaching tasks" check, we have already called
can_migrate_task(p) to determine whether the detached p is being dequeued
from its preferred LLC; can_migrate_task() uses can_migrate_llc_task() ->
can_migrate_llc() to carry out that check. That is to say, we only stop
further detaching once some tasks have already been detached.
Perhaps disallow it the first time, when "nr_balance_failed" is 0, but
let subsequent failed attempts explore breaking the preferred-LLC
restriction if there is still an imbalance and we are under
"mig_unrestricted" conditions.
I suppose you are suggesting that the threshold for stopping task detachment
should be higher. With the above can_migrate_llc() check, I suppose we have
raised the threshold for stopping "task detachment"?
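To make the "nr_balance_failed" suggestion above concrete, here is a minimal standalone sketch, not kernel code. The enum names are the verdict names used in this thread (their numeric values here are arbitrary), and keep_preferred_task() with its parameters is a hypothetical helper invented for illustration; it only shows the shape of the proposed policy.

```c
#include <stdbool.h>

/* Verdict names as used in this thread; values are arbitrary here. */
enum llc_mig { mig_forbid, mig_unrestricted };

/*
 * Hypothetical helper: should the balancer keep honoring the LLC
 * preference of the remaining tasks (i.e. stop detaching them)?
 */
bool keep_preferred_task(unsigned int nr_balance_failed, long imbalance,
			 enum llc_mig verdict)
{
	/* First balance attempt: always respect the preference. */
	if (nr_balance_failed == 0)
		return true;

	/*
	 * Later attempts: give up the preference only if an imbalance
	 * remains and the LLC check places no restriction on migration.
	 */
	if (imbalance > 0 && verdict == mig_unrestricted)
		return false;

	return true;
}
```

With this shape, the first attempt behaves like the hunk under discussion, and only repeated failures under "mig_unrestricted" conditions let preferred-LLC tasks leave.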
Say the LLC is under heavy load and we only have overloaded groups.
can_migrate_llc() would return "mig_unrestricted" since
fits_llc_capacity() would return false.
Since we are under "migrate_load", sched_balance_find_src_rq() has
returned the CPU with the highest load, which could very well be the
CPU with a large number of preferred-LLC tasks.
sched_cache_enabled() is still true, and when detach_tasks() reaches
one of these preferred-LLC tasks (which come at the very end of the
task list), we break out even if env->imbalance > 0, leaving a
potential imbalance for the "migrate_load" case.
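The scenario above can be illustrated with a toy model of the detach loop. This is a simplified sketch, not the kernel implementation: struct toy_task, toy_can_migrate_llc(), toy_detach() and the capacity threshold are all hypothetical stand-ins, and the can_migrate_task() filtering that precedes each detach is omitted. With recheck_llc set to false, the loop stops right after detaching the first task that prefers the source LLC, as in the hunk under discussion. With it set to true, the loop keeps going for as long as a stand-in LLC check, evaluated on the utilization still left in the source LLC, does not forbid migration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Verdict names as used in this thread; values are arbitrary here. */
enum llc_mig { mig_forbid, mig_unrestricted };

/* Hypothetical task: its utilization and whether it prefers the src LLC. */
struct toy_task {
	long util;
	bool prefers_src_llc;
};

/*
 * Assumed stand-in for can_migrate_llc(): migration out of the source
 * LLC is unrestricted only while its utilization exceeds its capacity.
 */
enum llc_mig toy_can_migrate_llc(long src_llc_util, long llc_capacity)
{
	return src_llc_util > llc_capacity ? mig_unrestricted : mig_forbid;
}

/*
 * Walk the task list (tasks preferring the source LLC are at the tail)
 * and return the imbalance left over when the loop stops.
 */
long toy_detach(const struct toy_task *tasks, size_t n, long imbalance,
		long src_llc_util, long llc_capacity, bool recheck_llc)
{
	for (size_t i = 0; i < n && imbalance > 0; i++) {
		/* "Detach" the task and account for the util that left. */
		imbalance -= tasks[i].util;
		src_llc_util -= tasks[i].util;

		if (imbalance <= 0)
			break;

		if (!tasks[i].prefers_src_llc)
			continue;

		/* Unconditional stop, as in the hunk under discussion. */
		if (!recheck_llc)
			break;

		/* Alternative: stop only once the LLC check forbids it. */
		if (toy_can_migrate_llc(src_llc_util, llc_capacity) == mig_forbid)
			break;
	}

	return imbalance;
}
```

For example, with one 100-util task that has no preference followed by three 200-util tasks that prefer the source LLC, an imbalance of 500, a source LLC utilization of 1000 and an assumed capacity of 600, the unconditional stop leaves an imbalance of 200 in this model, while the re-check variant brings it down to 0.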
Instead, we can account for the util moved out of the src_llc and
after accounting for it, check if can_migrate_llc() would return
"mig_forbid" for the src llc.
I see your point. The original decision matrix is intended to
spread the tasks when both LLCs are overloaded.
(src is the preferred LLC, dst is non-preferred LLC)
src \ dst   30%   40%   50%   60%
30%          N     N     N     N
40%          N     N     N     N
50%          N     N     G     G
60%          Y     N     G     G
src : src_util
dst : dst_util
Y : Yes, migrate
N : No, do not migrate
G : let the Generic load balance to even the load.
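For reference, the matrix above can be written down as a plain lookup table. This is only an illustrative sketch: the table entries are copied from the matrix, but the enum, llc_matrix, util_to_index() and llc_decision_for() are hypothetical names, and the patch itself computes the decision from utilization rather than from a fixed table. Utilization values not listed in the matrix are reported as -1 instead of being guessed.

```c
/* Decisions from the legend: N (no), Y (yes), G (generic balance). */
enum llc_decision { DEC_NO, DEC_YES, DEC_GENERIC };

/* Rows: src_util, columns: dst_util, each at 30%, 40%, 50%, 60%. */
static const enum llc_decision llc_matrix[4][4] = {
	/* src 30% */ { DEC_NO,  DEC_NO, DEC_NO,      DEC_NO      },
	/* src 40% */ { DEC_NO,  DEC_NO, DEC_NO,      DEC_NO      },
	/* src 50% */ { DEC_NO,  DEC_NO, DEC_GENERIC, DEC_GENERIC },
	/* src 60% */ { DEC_YES, DEC_NO, DEC_GENERIC, DEC_GENERIC },
};

/* Map a utilization percentage to a table index, or -1 if not listed. */
int util_to_index(int util_pct)
{
	switch (util_pct) {
	case 30: return 0;
	case 40: return 1;
	case 50: return 2;
	case 60: return 3;
	default: return -1;
	}
}

/* Look up the decision for a (src, dst) pair; -1 if outside the table. */
int llc_decision_for(int src_pct, int dst_pct)
{
	int s = util_to_index(src_pct);
	int d = util_to_index(dst_pct);

	if (s < 0 || d < 0)
		return -1;

	return llc_matrix[s][d];
}
```

Reading the table this way makes the lower-right "G" block easy to see: once both src and dst are at 50% or above, the decision is left to the generic load balancer.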
I suppose the code breaks that rule here for the reason Tim
mentioned in another thread: to inhibit tasks bouncing
between LLCs.
thanks,
Chenyu