Re: [PATCH 1/6] sched_ext: idle: Extend topology optimizations to all tasks

From: Andrea Righi
Date: Tue Mar 18 2025 - 03:31:47 EST


On Mon, Mar 17, 2025 at 08:22:35AM -1000, Tejun Heo wrote:
...
> > + /*
> > + * If the task is allowed to run on all CPUs, simply use the
> > + * architecture's cpumask directly. Otherwise, compute the
> > + * intersection of the architecture's cpumask and the task's
> > + * allowed cpumask.
> > + */
> > + if (!cpus || p->nr_cpus_allowed >= num_possible_cpus() ||
> > + cpumask_subset(cpus, p->cpus_ptr))
> > + return cpus;
> > +
> > + if (!cpumask_equal(cpus, p->cpus_ptr) &&
>
> Hmm... isn't this covered by the preceding cpumask_subset() test? Here, cpus
> is not a subset of p->cpus_ptr, so how can it be the same as p->cpus_ptr?
>
> > + cpumask_and(local_cpus, cpus, p->cpus_ptr))
> > + return local_cpus;
> > +
> > + return NULL;

Also, I'm wondering if there's really a benefit to checking
cpumask_subset() first and doing cpumask_and() only when it's needed, or if
we should just always do cpumask_and(). It's true that the check can save
some writes, but those writes go to a temporary local per-CPU cpumask, so
they shouldn't introduce cache contention.
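
To make the comparison concrete, here is a rough userspace sketch of the two
options (not kernel code, and not what the patch will necessarily look like).
It models a cpumask as a single 64-bit word; mask_t, mask_subset(),
mask_and(), task_mask_checked() and task_mask_simple() are names made up for
this illustration, and the nr_cpus_allowed shortcut is left out for brevity.
The only kernel semantics assumed are that cpumask_subset(a, b) is true when
a is contained in b, and that cpumask_and() returns true when the result is
non-empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, max 64 CPUs. */
typedef uint64_t mask_t;

/* Models cpumask_subset(a, b): true if every CPU set in a is also set in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/* Models cpumask_and(dst, a, b): writes a & b, returns true if non-empty. */
static bool mask_and(mask_t *dst, mask_t a, mask_t b)
{
	assert(dst != NULL);
	*dst = a & b;
	return *dst != 0;
}

/*
 * Option 1: test for the subset case first and skip the write to the
 * scratch mask when cpus can be returned as is. The extra "not equal"
 * test from the patch is dropped here: once the subset test has failed,
 * the two masks cannot be equal.
 */
static const mask_t *task_mask_checked(const mask_t *cpus,
				       const mask_t *allowed, mask_t *local)
{
	if (!cpus || mask_subset(*cpus, *allowed))
		return cpus;

	if (mask_and(local, *cpus, *allowed))
		return local;

	return NULL;
}

/*
 * Option 2: always compute the intersection into the scratch mask.
 * One unconditional write, no subset test.
 */
static const mask_t *task_mask_simple(const mask_t *cpus,
				      const mask_t *allowed, mask_t *local)
{
	if (!cpus)
		return NULL;

	if (mask_and(local, *cpus, *allowed))
		return local;

	return NULL;
}
```

In this model both variants yield the same set of CPUs for any input. The
only difference is that in the subset case the first one returns the original
mask pointer, while the second returns the scratch mask holding the same
bits.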

-Andrea