Re: [RFC PATCH v4 00/19] Core scheduling v4

From: Li, Aubrey
Date: Tue Jan 14 2020 - 22:43:26 EST


On 2020/1/14 23:40, Vineeth Remanan Pillai wrote:
> On Mon, Jan 13, 2020 at 8:12 PM Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
>
>> I also encountered kernel panic with the v4 code when taking cpu offline or online
>> when core scheduler is running. I've refreshed the previous patch, along
>> with 3 other patches to fix problems related to CPU online/offline.
>>
>> As a side effect of the fix, each core can now operate in core-scheduling
>> mode or non core-scheduling mode, depending on how many online SMT threads it has.
>>
>> Vineet, are you guys planning to refresh v4 and update it to v5? Aubrey posted
>> a port to the latest kernel earlier.
>>
> Thanks for the updated patch Tim.
>
> We have been testing with v4 rebased on 5.4.8 as RC kernels had given us
> trouble in the past. v5 is due soon and we are planning to release v5 when
> 5.5 comes out. As of now, v5 has your crash fixes and Aubrey's changes
> related to load balancing.

It turns out my load-balancing changes need to be refined. For example,
we currently don't migrate a task if its core cookie does not match the
destination CPU's core cookie. But if the entire destination core is idle,
we should allow the migration, something like the patch below.

I plan to do this after my Chinese New Year holiday (Feb 3rd).

Thanks,
-Aubrey

----------------------------------------------------------------------------------
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4728f5e..75e0172 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7366,8 +7366,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	 * We do not migrate tasks that are:
 	 * 1) throttled_lb_pair, or
 	 * 2) cannot be migrated to this CPU due to cpus_ptr, or
-	 * 3) running (obviously), or
-	 * 4) are cache-hot on their current CPU.
+	 * 3) task's cookie does not match with this CPU's core cookie, or
+	 * 4) running (obviously), or
+	 * 5) are cache-hot on their current CPU.
 	 */
 	if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
 		return 0;
@@ -7402,6 +7403,25 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 		return 0;
 	}

+#ifdef CONFIG_SCHED_CORE
+	if (sched_core_enabled(cpu_rq(env->dst_cpu))) {
+		bool idle_core = true;
+		int cpu;
+
+		for_each_cpu(cpu, cpu_smt_mask(env->dst_cpu)) {
+			if (!available_idle_cpu(cpu))
+				idle_core = false;
+		}
+		/*
+		 * Don't migrate task if task's cookie does not match
+		 * with core cookie, unless the entire core is idle.
+		 */
+		if (!idle_core && p->core_cookie !=
+		    cpu_rq(env->dst_cpu)->core->core_cookie)
+			return 0;
+	}
+#endif
+
 	/* Record that we found atleast one task that could run on dst_cpu */
 	env->flags &= ~LBF_ALL_PINNED;

--
2.7.4
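
For anyone who wants to reason about the rule outside the kernel tree, here
is a minimal userspace sketch of the decision the hunk above makes. It is
only a model, not kernel code: struct model_cpu, struct model_core and both
functions are hypothetical names made up for illustration, and a plain
"idle" flag stands in for what available_idle_cpu() reports in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one SMT sibling; not a kernel structure. */
struct model_cpu {
	bool idle;	/* models the result of available_idle_cpu() */
};

/* Hypothetical stand-in for a core: its siblings and its current cookie. */
struct model_core {
	const struct model_cpu *siblings;
	size_t nr_siblings;
	unsigned long core_cookie;
};

/* A core counts as idle only if every one of its siblings is idle. */
bool model_core_is_idle(const struct model_core *core)
{
	for (size_t i = 0; i < core->nr_siblings; i++) {
		if (!core->siblings[i].idle)
			return false;
	}
	return true;
}

/*
 * Allow the migration if the whole destination core is idle;
 * otherwise require the task's cookie to match the core's cookie.
 */
bool model_cookie_allows_migration(unsigned long task_cookie,
				   const struct model_core *dst)
{
	if (model_core_is_idle(dst))
		return true;

	return task_cookie == dst->core_cookie;
}
```

With one busy sibling, only a task carrying the core's cookie is accepted;
once all siblings are idle, any cookie is accepted. Unlike the patch, the
sketch stops scanning at the first busy sibling, which does not change the
result.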