Re: [PATCH v2 1/2] sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes

From: Phil Auld
Date: Thu Nov 14 2024 - 11:00:11 EST


On Thu, Nov 14, 2024 at 02:28:09PM +0000 Juri Lelli wrote:
> When non-destructive root domain changes happen (e.g., only one of the
> existing root domains is modified while the rest are left untouched),
> we still need to clear DEADLINE bandwidth accounting so that it can
> then be properly restored, taking into account the DEADLINE tasks
> associated with each cpuset (and thus with each root domain). Since
> the introduction of dl_servers, we fail to restore the servers'
> contribution after non-destructive changes, as servers are only
> considered on destructive changes, when runqueues are attached to the
> new domains.
>
> Fix this by making sure we iterate over the dl_servers attached to
> domains that have not been destroyed and add their bandwidth
> contribution back correctly.
>
> Signed-off-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
>


Reviewed-by: Phil Auld <pauld@xxxxxxxxxx>


> ---
> v1->v2: always restore, considering a root domain span (and check for
> active cpus)
> ---
>  kernel/sched/deadline.c | 17 ++++++++++++++---
>  kernel/sched/topology.c |  8 +++++---
>  2 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 9ce93d0bf452..a9cdbf058871 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2970,11 +2970,22 @@ void dl_add_task_root_domain(struct task_struct *p)
>
> void dl_clear_root_domain(struct root_domain *rd)
> {
> -	unsigned long flags;
> +	int i;
>  
> -	raw_spin_lock_irqsave(&rd->dl_bw.lock, flags);
> +	guard(raw_spinlock_irqsave)(&rd->dl_bw.lock);
>  	rd->dl_bw.total_bw = 0;
> -	raw_spin_unlock_irqrestore(&rd->dl_bw.lock, flags);
> +
> +	/*
> +	 * dl_server bandwidth is only restored when CPUs are attached to root
> +	 * domains (after domains are created or CPUs moved back to the
> +	 * default root domain).
> +	 */
> +	for_each_cpu(i, rd->span) {
> +		struct sched_dl_entity *dl_se = &cpu_rq(i)->fair_server;
> +
> +		if (dl_server(dl_se) && cpu_active(i))
> +			rd->dl_bw.total_bw += dl_se->dl_bw;
> +	}
> }
>
> #endif /* CONFIG_SMP */
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 9748a4c8d668..9c405f0e7b26 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2721,9 +2721,11 @@ void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
>
>  				/*
>  				 * This domain won't be destroyed and as such
> -				 * its dl_bw->total_bw needs to be cleared. It
> -				 * will be recomputed in function
> -				 * update_tasks_root_domain().
> +				 * its dl_bw->total_bw needs to be cleared.
> +				 * Tasks' contribution will then be recomputed
> +				 * by dl_update_tasks_root_domain() and the
> +				 * dl_servers' contribution by
> +				 * dl_restore_server_root_domain().
>  				 */
>  				rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;
>  				dl_clear_root_domain(rd);
> --
> 2.47.0
>

--