[RFC] sched/isolation: Fix CPU affinity issues for several tasks

From: lizhe . 67
Date: Mon Apr 29 2024 - 23:40:19 EST


From: Li Zhe <lizhe.67@xxxxxxxxxxxxx>

If the "nohz_full=" command-line parameter includes cpu 0, the cpu affinity
of the kernel threads "kthreadd", "rcu_sched", "rcuos%d" and "rcuog%d" is
always 0x01, that is, these threads can only run on cpu 0. This contradicts
the intent of nohz_full, which is to keep such threads off the isolated cpus.

The root cause is that the variable 'cpu_valid_mask' in
__set_cpus_allowed_ptr_locked() contains only cpu 0 until SMP
initialization has completed. If set_cpus_allowed_ptr() is called before
then with a cpumask that does not contain cpu 0, the call fails and the
affinity is left unchanged. The threads "kthreadd" and "rcu_sched" call
set_cpus_allowed_ptr() early during system startup, so they hit this
failure. The threads "rcuos%d" and "rcuog%d" then inherit the wrong cpu
affinity from "kthreadd".
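
The failure described above can be replayed with a small userspace model.
This is only a sketch under stated assumptions: cpumasks are modelled as
64-bit bitmaps, and 'struct model_task', model_set_cpus_allowed() and
model_early_boot_demo() are hypothetical names, not kernel code. The only
behaviour taken from the description is that a request sharing no cpu with
the valid mask is rejected and leaves the affinity untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a task's affinity; not the kernel's task_struct. */
struct model_task {
	uint64_t cpus_mask;
};

/*
 * Simplified model of the check described above: the request is rejected
 * when the new mask shares no cpu with the currently valid cpus.
 */
static int model_set_cpus_allowed(struct model_task *t, uint64_t new_mask,
				  uint64_t cpu_valid_mask)
{
	if ((new_mask & cpu_valid_mask) == 0)
		return -EINVAL;	/* affinity is left unchanged */

	t->cpus_mask = new_mask;
	return 0;
}

/* Replays the early-boot sequence and returns the resulting affinity. */
uint64_t model_early_boot_demo(void)
{
	struct model_task kthreadd = { .cpus_mask = 0x1 };	/* cpu 0 only */
	uint64_t early_valid = 0x1;	/* only cpu 0 before SMP init */
	uint64_t housekeeping = 0xe;	/* cpus 1-3, e.g. nohz_full=0 on 4 cpus */
	int ret = model_set_cpus_allowed(&kthreadd, housekeeping, early_valid);

	assert(ret == -EINVAL);
	return kthreadd.cpus_mask;
}
```

In this model the early request fails, so the affinity remains 0x01, and the
same request succeeds once the valid mask covers all four cpus.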

I tried to fix this problem by adapting set_cpus_allowed_ptr(), but
task_struct->cpus_ptr is referenced or modified in the scheduling path,
which makes a fix in that function more difficult. So this patch instead
clears the boot cpu (smp_processor_id(), i.e. cpu 0 in the case above) from
the nohz_full range and adds it to the housekeeping mask.
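
The effect of that adjustment on the two masks can be sketched in userspace
as follows. Again this is an illustrative model only: cpumasks are 64-bit
bitmaps, 'struct model_hk_masks' and model_keep_boot_cpu() are hypothetical
names, the 'affects_kthreads' argument stands in for the HK_FLAG_KTHREAD
test, and boot_cpu is assumed to be below 64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of masks mirroring the staging masks in the patch. */
struct model_hk_masks {
	uint64_t housekeeping;
	uint64_t non_housekeeping;
};

/*
 * If kthread affinity is affected and the boot cpu is in the isolated
 * range, move it to the housekeeping side.
 * Returns true if the masks were changed.
 */
static bool model_keep_boot_cpu(struct model_hk_masks *m,
				unsigned int boot_cpu, bool affects_kthreads)
{
	uint64_t bit = UINT64_C(1) << boot_cpu;	/* assumes boot_cpu < 64 */

	if (!affects_kthreads || !(m->non_housekeeping & bit))
		return false;

	m->housekeeping |= bit;
	m->non_housekeeping &= ~bit;
	return true;
}

/* Example: nohz_full=0-1 on a 4-cpu machine, booting on cpu 0. */
uint64_t model_boot_cpu_demo(void)
{
	struct model_hk_masks m = {
		.housekeeping = 0xc,		/* cpus 2-3 */
		.non_housekeeping = 0x3,	/* cpus 0-1 */
	};
	bool changed = model_keep_boot_cpu(&m, 0, true);

	assert(changed);
	return m.housekeeping;
}
```

With these inputs cpu 0 moves to the housekeeping side, so the housekeeping
mask intersects the early valid mask and the early affinity request can
succeed.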

Signed-off-by: Li Zhe <lizhe.67@xxxxxxxxxxxxx>
---
kernel/sched/isolation.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 5891e715f00d..7b9bcfcd3c55 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -152,6 +152,13 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
 	if (cpumask_empty(non_housekeeping_mask))
 		goto free_housekeeping_staging;

+	if ((flags & HK_FLAG_KTHREAD) &&
+	    cpumask_test_cpu(smp_processor_id(), non_housekeeping_mask)) {
+		pr_warn("Housekeeping: Clearing cpu %d from nohz_full range\n", smp_processor_id());
+		__cpumask_set_cpu(smp_processor_id(), housekeeping_staging);
+		__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
+	}
+
 	if (!housekeeping.flags) {
 		/* First setup call ("nohz_full=" or "isolcpus=") */
 		enum hk_type type;
--
2.20.1