Re: [PATCH v2 3/9] sched: Remove checks against SD_LOAD_BALANCE

From: Valentin Schneider
Date: Thu Mar 19 2020 - 08:05:57 EST



On Thu, Mar 19 2020, Dietmar Eggemann wrote:
> On 11.03.20 19:15, Valentin Schneider wrote:
>> Potential users of that flag could have been cpusets and isolcpus.
>>
>> cpusets don't need it because they define exclusive (i.e. non-overlapping)
>> domain spans, see cpuset.cpu_exclusive and cpuset.sched_load_balance.
>> If such a cpuset contains a single CPU, it will have the NULL domain
>> attached to it. If it contains several CPUs, none of their domains will
>> extend beyond the span of the cpuset.
>
> There are also non-exclusive cpusets but I assume the statement is the same.
>

Right, AFAICT the cpuset.cpu_exclusive thing doesn't actually impact the
sched_domains, only how CPUs can be allocated to cpusets. The important
bits are:

- the CPUs spanned by the cpuset
- whether cpuset.sched_load_balance is set

> CPUs which are only used in cpusets with cpuset.sched_load_balance=0 are
> attached to the NULL sched-domain.
>

Indeed, I was only considering the case with root.sched_load_balance=0
and sibling cpusets having cpuset.sched_load_balance=1, in which case
we get separate root domains. If !root cpusets have
sched_load_balance=0, the related CPUs will only get the NULL domain
attached to them.

> There seems to be no code which alters the SD_LOAD_BALANCE flag.
>

The sysctl interface would've been the last possible modifier.

Your comments make me realize that the changelog isn't great; what about
the following?

---

The SD_LOAD_BALANCE flag is set unconditionally for all domains in
sd_init(). By making the sched_domain->flags sysctl interface read-only, we
have removed the last piece of code that could clear that flag - as such,
it will now always be present. Rather than keep carrying it along, we can
work towards getting rid of it entirely.

cpusets don't need it because they can have CPUs attached to the NULL
domain (e.g. cpuset with sched_load_balance=0), or to a partitioned
root_domain, i.e. a sched_domain hierarchy that doesn't span the entire
system (e.g. root cpuset with sched_load_balance=0 and sibling cpusets with
sched_load_balance=1).

isolcpus applies the same "trick": isolated CPUs are explicitly taken out of
the sched_domain rebuild (using housekeeping_cpumask()), so they get the
NULL domain treatment as well.

Remove the checks against SD_LOAD_BALANCE.
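
---

For illustration only, here is a toy userspace model of why such checks end
up as dead code. It is not kernel code: struct toy_domain, TOY_SD_LOAD_BALANCE
and the toy_* helpers are hypothetical names, and the flag value is arbitrary.
The only things taken from the changelog above are that the flag is set
unconditionally at init time and that CPUs excluded from balancing get the
NULL domain rather than a domain with the flag cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SD_LOAD_BALANCE; the value is arbitrary. */
#define TOY_SD_LOAD_BALANCE 0x0001

/* Hypothetical, heavily simplified stand-in for struct sched_domain. */
struct toy_domain {
	struct toy_domain *parent;	/* next level up, NULL at the top */
	int flags;
};

/* Models the assumption that the flag is set unconditionally at init. */
void toy_sd_init(struct toy_domain *sd, struct toy_domain *parent)
{
	assert(sd != NULL);
	sd->parent = parent;
	sd->flags = TOY_SD_LOAD_BALANCE;
}

/* Walk up the hierarchy, skipping levels that lack the flag. */
int toy_count_levels_checked(const struct toy_domain *base)
{
	int n = 0;

	for (const struct toy_domain *sd = base; sd; sd = sd->parent) {
		if (!(sd->flags & TOY_SD_LOAD_BALANCE))
			continue;	/* never taken if nothing clears the flag */
		n++;
	}
	return n;
}

/* Same walk with the check removed. */
int toy_count_levels_unchecked(const struct toy_domain *base)
{
	int n = 0;

	for (const struct toy_domain *sd = base; sd; sd = sd->parent)
		n++;
	return n;
}
```

As long as nothing clears the flag after toy_sd_init(), both walks visit the
same levels. A CPU with the NULL domain is modelled by passing base == NULL,
in which case neither loop body runs at all.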