Re: [PATCH] cgroup/cpuset: make DL attach bandwidth reservation root-domain aware
From: Guopeng Zhang
Date: Sun Apr 26 2026 - 09:48:40 EST
On 2026/4/24 22:15, Waiman Long wrote:
> On 4/21/26 4:34 AM, Guopeng Zhang wrote:
>> cpuset_can_attach() currently sums the bandwidth of all migrating
>> SCHED_DEADLINE tasks and reserves destination bandwidth whenever the
>> old and new cpuset effective CPU masks do not overlap.
>>
>> That condition is stronger than what the scheduler uses when migrating
>> a deadline task. set_cpus_allowed_dl() only subtracts bandwidth from
>> the source side when moving the task requires a DL bandwidth move
>> between root domains.
>>
>> As a result, moving a deadline task between disjoint member cpusets that
>> still belong to the same root domain can reserve destination bandwidth
>> even though no matching source-side subtraction happens. Successful
>> back-and-forth migrations between such cpusets can monotonically
>> increase dl_bw->total_bw.
>>
>> Fix this by extracting the source root-domain test already used by
>> set_cpus_allowed_dl() into a shared helper and make cpuset DL bandwidth
>> preallocation use that same condition. Count all migrating deadline
>> tasks for cpuset task accounting, but only accumulate sum_migrate_dl_bw
>> for tasks that actually need a DL bandwidth move. Reserve and rollback
>> bandwidth only for that subset.
>>
>> This keeps successful attach accounting aligned with
>> set_cpus_allowed_dl() and avoids double-accounting within a single
>> root domain.
>>
>> Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
>> Signed-off-by: Guopeng Zhang <zhangguopeng@xxxxxxxxxx>
>> ---
>> include/linux/sched/deadline.h | 9 +++++++++
>> kernel/cgroup/cpuset-internal.h | 1 +
>> kernel/cgroup/cpuset.c | 34 ++++++++++++++++-----------------
>> kernel/sched/deadline.c | 14 +++++++++++---
>> 4 files changed, 38 insertions(+), 20 deletions(-)
>>
...
>> @@ -3137,6 +3135,16 @@ static void set_cpus_allowed_dl(struct task_struct *p,
>> set_cpus_allowed_common(p, ctx);
>> }
>> +bool dl_task_needs_bw_move(struct task_struct *p,
>> + const struct cpumask *new_mask)
>> +{
>> + if (!dl_task(p))
>> + return false;
>> +
>> + guard(rcu)();
>
> Why do you need an RCU guard here?
Hi Longman,

Thanks for the review.

I added the RCU guard in the first version because the helper reads
task_rq(p)->rd->span, and root domains are replaced and freed through
RCU. My initial thought was to make the helper self-contained with
respect to the lifetime of rq->rd and its span.

After re-checking the current callers more carefully,
dl_task_needs_bw_move() is only used by cpuset_can_attach() and
set_cpus_allowed_dl() in this patch.

cpuset_can_attach() runs in the cgroup attach path, which already holds
cpus_read_lock(), and cpuset itself also holds cpuset_mutex there.
set_cpus_allowed_dl() runs under task_rq_lock()/rq->lock in the affinity
change path.

So for the current callers, the RCU guard does not appear to be
strictly necessary.

I plan to drop guard(rcu)() in the next version. Does that sound
reasonable to you?
I am also checking the Sashiko bot comments and will address them in the
next revision as appropriate.
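
For readers following the thread, the accounting mismatch described in
the commit message can be illustrated with a small user-space model.
This is only a sketch and not kernel code: mask_t, struct model_rd,
old_needs_reserve(), needs_bw_move(), model_attach() and simulate() are
hypothetical names, a 64-bit integer stands in for struct cpumask, and
the bandwidth values are arbitrary. It assumes one root domain spanning
CPUs 0-3 with two disjoint member cpusets (CPUs 0-1 and CPUs 2-3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model only; none of these names are kernel APIs. */
typedef uint64_t mask_t;        /* one bit per CPU, stands in for struct cpumask */

struct model_rd {               /* stands in for a root domain */
	mask_t span;            /* CPUs covered by the root domain */
	uint64_t total_bw;      /* stands in for dl_bw->total_bw */
};

static bool masks_intersect(mask_t a, mask_t b)
{
	return (a & b) != 0;
}

/* Old attach condition: reserve whenever the two cpuset masks are disjoint. */
static bool old_needs_reserve(mask_t old_cs, mask_t new_cs)
{
	return !masks_intersect(old_cs, new_cs);
}

/* Shared condition: a move is needed only when leaving the source root domain. */
static bool needs_bw_move(const struct model_rd *src, mask_t new_mask)
{
	return !masks_intersect(src->span, new_mask);
}

/*
 * One attach. The destination reservation follows 'reserve'; the
 * source-side subtraction always follows the root-domain condition.
 */
static void model_attach(struct model_rd *src, struct model_rd *dst,
			 mask_t new_cs, bool reserve, uint64_t bw)
{
	if (reserve)
		dst->total_bw += bw;
	if (needs_bw_move(src, new_cs)) {
		assert(src->total_bw >= bw);
		src->total_bw -= bw;
	}
}

/* Bounce one task between two disjoint cpusets of the same root domain. */
static uint64_t simulate(bool use_old_condition, int migrations)
{
	struct model_rd rd = { .span = 0xF, .total_bw = 100 };
	const mask_t cs[2] = { 0x3, 0xC };      /* CPUs 0-1 and CPUs 2-3 */
	const uint64_t task_bw = 100;

	for (int i = 0; i < migrations; i++) {
		mask_t from = cs[i % 2];
		mask_t to = cs[(i + 1) % 2];
		bool reserve = use_old_condition ?
			old_needs_reserve(from, to) : needs_bw_move(&rd, to);

		model_attach(&rd, &rd, to, reserve, task_bw);
	}
	return rd.total_bw;
}
```

In this model, simulate(true, 4) returns 500 because every migration
reserves bandwidth that is never subtracted, while simulate(false, 4)
stays at 100 because the reservation and the subtraction use the same
condition.
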
Thanks,
Guopeng
>
> Cheers,
> Longman
>
>> + return !cpumask_intersects(task_rq(p)->rd->span, new_mask);
>> +}
>> +
>> /* Assumes rq->lock is held */
>> static void rq_online_dl(struct rq *rq)
>> {