Re: [patch V4 08/20] sched/mmcid: Use cpumask_weighted_or()

From: Yury Norov

Date: Wed Nov 19 2025 - 11:20:36 EST


On Sun, Nov 16, 2025 at 09:48:49PM +0100, Thomas Gleixner wrote:
> Use cpumask_weighted_or() instead of cpumask_or() followed by
> cpumask_weight() on the result, which walks the same bitmap twice. This
> results in 10-20% fewer cycles, which reduces the runqueue lock hold time.
>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>

Acked-by: Yury Norov (NVIDIA) <yury.norov@xxxxxxxxx>

> ---
> kernel/sched/core.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -10377,6 +10377,7 @@ void call_trace_sched_update_nr_running(
> static inline void mm_update_cpus_allowed(struct mm_struct *mm, const struct cpumask *affmsk)
> {
>  	struct cpumask *mm_allowed;
> +	unsigned int weight;
>
>  	if (!mm)
>  		return;
> @@ -10387,8 +10388,8 @@ static inline void mm_update_cpus_allowe
>  	 */
>  	guard(raw_spinlock)(&mm->mm_cid.lock);
>  	mm_allowed = mm_cpus_allowed(mm);
> -	cpumask_or(mm_allowed, mm_allowed, affmsk);
> -	WRITE_ONCE(mm->mm_cid.nr_cpus_allowed, cpumask_weight(mm_allowed));
> +	weight = cpumask_weighted_or(mm_allowed, mm_allowed, affmsk);
> +	WRITE_ONCE(mm->mm_cid.nr_cpus_allowed, weight);
> }
>
> void sched_mm_cid_exit_signals(struct task_struct *t)
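
For readers unfamiliar with the idea behind the change, the difference between
"OR, then count" and a fused "OR and count" can be sketched in userspace C.
This is only an illustration under stated assumptions, not the kernel's
implementation: the sketch_* function names and the plain unsigned long array
standing in for a cpumask are hypothetical, and __builtin_popcountl() is a
GCC/Clang builtin rather than a kernel helper.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a cpumask: a plain array of unsigned long words.
 * Two-pass variant: OR the sources into dst, then walk dst a second time to
 * count the set bits. This mirrors cpumask_or() followed by cpumask_weight().
 */
static unsigned int sketch_or_then_weight(unsigned long *dst,
					  const unsigned long *a,
					  const unsigned long *b,
					  size_t nwords)
{
	unsigned int weight = 0;
	size_t i;

	for (i = 0; i < nwords; i++)
		dst[i] = a[i] | b[i];

	for (i = 0; i < nwords; i++)
		weight += (unsigned int)__builtin_popcountl(dst[i]);

	return weight;
}

/*
 * Fused variant: each word is ORed, stored and counted in a single pass, so
 * the bitmap is walked only once. dst may alias a or b, as in the patch where
 * mm_allowed is both destination and source.
 */
static unsigned int sketch_weighted_or(unsigned long *dst,
				       const unsigned long *a,
				       const unsigned long *b,
				       size_t nwords)
{
	unsigned int weight = 0;
	size_t i;

	for (i = 0; i < nwords; i++) {
		unsigned long w = a[i] | b[i];

		dst[i] = w;
		weight += (unsigned int)__builtin_popcountl(w);
	}

	return weight;
}
```

Both functions leave the same bits in dst and return the same weight; the
fused one simply avoids the second walk over the bitmap, which is the saving
the commit message describes.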