Re: [PATCH v3 4/8] sched/deadline: Rebuild root domain accounting after every update
From: Dietmar Eggemann
Date: Wed Mar 12 2025 - 05:53:44 EST
On 11/03/2025 15:51, Waiman Long wrote:
> On 3/11/25 9:29 AM, Dietmar Eggemann wrote:
>> On 11/03/2025 13:34, Waiman Long wrote:
>>> On 3/11/25 7:59 AM, Juri Lelli wrote:
>>>> On 10/03/25 20:16, Waiman Long wrote:
>>>>> On 3/10/25 3:18 PM, Waiman Long wrote:
>>>>>> On 3/10/25 2:54 PM, Dietmar Eggemann wrote:
>>>>>>> On 10/03/2025 10:37, Juri Lelli wrote:
[...]
>> Testcase: suspend/resume
>>
>> on Arm64 big.LITTLE cpumask=[LITTLE][big]=[0,3-5][1-2]
>> plus cmd line option 'isolcpus=3,4'.
>>
>> with Waiman's snippet:
>> https://lkml.kernel.org/r/fd4d6143-9bd2-4a7c-80dc-1e19e4d1b2d1@xxxxxxxxxx
>>
>> ...
>> [ 234.831675] --- > partition_sched_domains_locked() reset_domain=1
>> [ 234.835966] psci: CPU4 killed (polled 0 ms)
>> [ 234.838912] Error taking CPU3 down: -16
>> [ 234.838952] Non-boot CPUs are not disabled
>> [ 234.838986] Enabling non-boot CPUs ...
>> ...
>>
>> IIRC, that's the old DL accounting issue.
>
> You are right. cpuhp_tasks_frozen will be set in the suspend/resume
> case. In that case, we do need to add a cpuset helper to acquire the
> cpuset_mutex. A test patch as follows (no testing done yet):
>
> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
> index c414daa7d503..ef1ffb9c52b0 100644
> --- a/include/linux/cpuset.h
> +++ b/include/linux/cpuset.h
> @@ -129,6 +129,7 @@ extern void dl_rebuild_rd_accounting(void);
> extern void rebuild_sched_domains(void);
>
> extern void cpuset_print_current_mems_allowed(void);
> +extern void cpuset_reset_sched_domains(void)
>
> /*
> * read_mems_allowed_begin is required when making decisions involving
> @@ -269,6 +270,11 @@ static inline void rebuild_sched_domains(void)
> partition_sched_domains(1, NULL, NULL);
> }
>
> +static inline void cpuset_reset_sched_domains(void)
> +{
> + partition_sched_domains(1, NULL, NULL);
> +}
> +
> static inline void cpuset_print_current_mems_allowed(void)
> {
> }
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 7995cd58a01b..a51099e5d587 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1076,6 +1076,13 @@ void rebuild_sched_domains(void)
> cpus_read_unlock();
> }
>
> +void cpuset_reset_sched_domains(void)
> +{
> + mutex_lock(&cpuset_mutex);
> + partition_sched_domains(1, NULL, NULL);
> + mutex_unlock(&cpuset_mutex);
> +}
> +
> /**
> * cpuset_update_tasks_cpumask - Update the cpumasks of tasks in the
> cpuset.
> * @cs: the cpuset in which each task's cpus_allowed mask needs to be
> changed
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 58593f4d09a1..dbf44ddbb6b4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8183,7 +8183,7 @@ static void cpuset_cpu_active(void)
> * operation in the resume sequence, just build a single
> sched
> * domain, ignoring cpusets.
> */
> - partition_sched_domains(1, NULL, NULL);
> + cpuset_reset_sched_domains();
> if (--num_cpus_frozen)
> return;
> /*
> @@ -8202,7 +8202,7 @@ static void cpuset_cpu_inactive(unsigned int cpu)
> cpuset_update_active_cpus();
> } else {
> num_cpus_frozen++;
> - partition_sched_domains(1, NULL, NULL);
> + cpuset_reset_sched_domains();
> }
> }
This seems to work. But what about a !CONFIG_CPUSETS build. In this case
we won't have this DL accounting update during suspend/resume since
dl_rebuild_rd_accounting() is empty.