Re: [PATCH] cgroup/cpuset: make DL attach bandwidth reservation root-domain aware

From: Waiman Long

Date: Mon Apr 27 2026 - 09:52:47 EST


On 4/26/26 9:48 AM, Guopeng Zhang wrote:

On 2026/4/24 22:15, Waiman Long wrote:
On 4/21/26 4:34 AM, Guopeng Zhang wrote:
cpuset_can_attach() currently sums the bandwidth of all migrating
SCHED_DEADLINE tasks and reserves destination bandwidth whenever the
old and new cpuset effective CPU masks do not overlap.

That condition is stronger than what the scheduler uses when migrating
a deadline task. set_cpus_allowed_dl() only subtracts bandwidth from
the source side when moving the task requires a DL bandwidth move
between root domains.

As a result, moving a deadline task between disjoint member cpusets that
still belong to the same root domain can reserve destination bandwidth
even though no matching source-side subtraction happens. Successful
back-and-forth migrations between such cpusets can monotonically
increase dl_bw->total_bw.

Fix this by extracting the source root-domain test already used by
set_cpus_allowed_dl() into a shared helper and making cpuset DL bandwidth
preallocation use that same condition. Count all migrating deadline
tasks for cpuset task accounting, but only accumulate sum_migrate_dl_bw
for tasks that actually need a DL bandwidth move. Reserve and roll back
bandwidth only for that subset.

This keeps successful attach accounting aligned with
set_cpus_allowed_dl() and avoids double-accounting within a single
root domain.
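
To make the accounting mismatch described above concrete, here is a toy
userspace model. It is not kernel code: the toy_* types and functions are
hypothetical, CPU masks are plain 64-bit bitmaps, and the two rules are
reduced to the mask tests the description names (disjoint cpuset masks
for the old attach path, "new mask leaves the source root domain" for
set_cpus_allowed_dl() and the proposed helper).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */
struct toy_root_domain {
	uint64_t span;      /* CPUs covered by this root domain */
	uint64_t total_bw;  /* sum of admitted deadline bandwidth */
};

struct toy_dl_task {
	uint64_t bw;                 /* bandwidth of this deadline task */
	uint64_t cpus;               /* effective CPUs of its current cpuset */
	struct toy_root_domain *rd;  /* root domain it currently belongs to */
};

/* Scheduler-side test: a move is needed only when the new mask does not
 * intersect the source root domain's span. */
static bool toy_needs_bw_move(const struct toy_dl_task *t, uint64_t new_mask)
{
	return (t->rd->span & new_mask) == 0;
}

/* Old attach-side test: reserve whenever the cpuset masks are disjoint. */
static bool toy_old_reserve(const struct toy_dl_task *t, uint64_t new_mask)
{
	return (t->cpus & new_mask) == 0;
}

static void toy_attach(struct toy_dl_task *t, uint64_t new_mask,
		       struct toy_root_domain *dst, bool use_fixed_rule)
{
	/* The destination cpuset must lie inside the destination domain. */
	assert(new_mask != 0 && (dst->span & new_mask) == new_mask);

	bool reserve = use_fixed_rule ? toy_needs_bw_move(t, new_mask)
				      : toy_old_reserve(t, new_mask);
	if (reserve)
		dst->total_bw += t->bw;    /* attach-side reservation */

	if (toy_needs_bw_move(t, new_mask))
		t->rd->total_bw -= t->bw;  /* scheduler-side subtraction */

	t->cpus = new_mask;
	t->rd = dst;
}

/* Bounce one task between two disjoint cpusets (CPUs 0-1 and CPUs 2-3)
 * that share a single root domain spanning CPUs 0-3. */
static uint64_t toy_bounce(int rounds, bool use_fixed_rule)
{
	struct toy_root_domain rd = { .span = 0xF, .total_bw = 100 };
	struct toy_dl_task t = { .bw = 100, .cpus = 0x3, .rd = &rd };

	for (int i = 0; i < rounds; i++) {
		toy_attach(&t, 0xC, &rd, use_fixed_rule);
		toy_attach(&t, 0x3, &rd, use_fixed_rule);
	}
	return rd.total_bw;
}
```

In this model, toy_bounce(3, false) ends with total_bw at 700 because each
of the six attaches reserves 100 with no matching subtraction, while
toy_bounce(3, true) leaves it at the initial 100.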

Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Signed-off-by: Guopeng Zhang <zhangguopeng@xxxxxxxxxx>
---
  include/linux/sched/deadline.h  |  9 +++++++++
  kernel/cgroup/cpuset-internal.h |  1 +
  kernel/cgroup/cpuset.c          | 34 ++++++++++++++++-----------------
  kernel/sched/deadline.c         | 14 +++++++++++---
  4 files changed, 38 insertions(+), 20 deletions(-)

...
@@ -3137,6 +3135,16 @@ static void set_cpus_allowed_dl(struct task_struct *p,
      set_cpus_allowed_common(p, ctx);
  }
  +bool dl_task_needs_bw_move(struct task_struct *p,
+               const struct cpumask *new_mask)
+{
+    if (!dl_task(p))
+        return false;
+
+    guard(rcu)();
Why do you need an RCU guard here?
Hi Longman,

Thanks for the review.

I added the RCU guard in the first version because the helper reads
task_rq(p)->rd->span, and root domains are replaced and freed through
RCU. My initial thought was to make the helper self-contained for the
rq->rd/span lifetime aspect.

After re-checking the current callers more carefully,
dl_task_needs_bw_move() is only used by cpuset_can_attach() and
set_cpus_allowed_dl() in this patch.

cpuset_can_attach() runs in the cgroup attach path, which already holds
cpus_read_lock(), and cpuset itself also holds cpuset_mutex there.
set_cpus_allowed_dl() runs under task_rq_lock()/rq->lock in the affinity
change path.

So for the current callers, the RCU guard does not appear to be
strictly necessary.

I plan to drop guard(rcu)() in the next version. Does that sound
reasonable to you?

I am also checking the Sashiko bot comments and will address them in the
next revision as appropriate.

That sounds reasonable. Creation and destruction of root domains are controlled by cpuset, so the root domains won't change when the helper is called from cpuset_can_attach() in the cpuset code.

Cheers,
Longman