> On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote:
>> The soft affinity CPUs present in the cpumask cpus_preferred is used by the
>> scheduler in two levels of search. The first is in determining wake
>> affine, which chooses the LLC domain; the second is while searching for
>> idle CPUs in the LLC domain. In the first level it uses cpus_preferred to
>> prune the search space. In the second level it first searches
>> cpus_preferred and then cpus_allowed. Using the affinity_unequal flag it
>> breaks out early to avoid any overhead in the scheduler fast path when
>> soft affinity is not used. This only changes the wakeup path of the
>> scheduler; the idle balancing is unchanged. Together they achieve the
>> "softness" of scheduling.
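
To make the two-level search concrete, here is a minimal sketch of its
shape, assuming the cpus_preferred mask and affinity_unequal flag this
series adds to task_struct; it is an illustration of the described
behavior, not the patch itself:

static int select_idle_cpu_preferred(struct task_struct *p,
                                     struct sched_domain *sd)
{
        int cpu;

        /* Level 1: look for an idle CPU in the preferred set first. */
        for_each_cpu_and(cpu, &p->cpus_preferred, sched_domain_span(sd)) {
                if (available_idle_cpu(cpu))
                        return cpu;
        }

        /*
         * affinity_unequal is clear when cpus_preferred equals
         * cpus_allowed, so plain hard affinity breaks out here and
         * pays no extra search cost.
         */
        if (!p->affinity_unequal)
                return -1;

        /* Level 2: fall back to the full hard-affinity mask. */
        for_each_cpu_and(cpu, &p->cpus_allowed, sched_domain_span(sd)) {
                if (available_idle_cpu(cpu))
                        return cpu;
        }

        return -1;
}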

> I really dislike this implementation.
>
> I thought the idea was to remain work conserving (in so far as we are
> that anyway), so changing select_idle_sibling() doesn't make sense to me.
> If there is idle, we use it.
>
> Same for newidle; which you already retained.

The scheduler is already not work conserving in many ways. Soft affinity is
still work conserving in the second level of search, since it falls back to
cpus_allowed when no idle CPU is found in cpus_preferred.

> This then leaves regular balancing, and for that we can fudge with
> can_migrate_task() and nr_balance_failed or something.

Possibly, but I don't know if similar performance behavior can be achieved
by confining the changes to the regular balancing path.
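
For concreteness, the kind of fudge being suggested might look like the
sketch below. The helper name soft_affinity_allows() and the reuse of
cache_nice_tries as the give-up threshold are illustrative assumptions,
not anything posted in this series:

/* Called from can_migrate_task(); env->dst_cpu is where the balancer
 * wants to move @p. */
static int soft_affinity_allows(struct task_struct *p, struct lb_env *env)
{
        /* Migrating within the preferred set is always fine. */
        if (cpumask_test_cpu(env->dst_cpu, &p->cpus_preferred))
                return 1;

        /*
         * Resist moving to a non-preferred CPU, but only until the
         * balancer has failed repeatedly; then the softness gives way
         * and the task may go anywhere in cpus_allowed.
         */
        return env->sd->nr_balance_failed > env->sd->cache_nice_tries;
}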

> And I also really don't want a second utilization tipping point; we
> already have the overloaded thing.

The numbers in the cover letter show that a static tipping point will not
work for all workloads.
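
For reference, the "overloaded thing" is, roughly, a single root-domain
flag; the following paraphrases kernel/sched/fair.c of this era (the real
code sets it via SG_OVERLOAD in update_sd_lb_stats()) rather than quoting
it:

/* Producer side: the load-balance statistics pass marks the root
 * domain overloaded when some CPU has more than one runnable task. */
static void note_overload(struct rq *rq)
{
        if (rq->nr_running > 1)
                WRITE_ONCE(rq->rd->overload, 1);
}

/* Consumer side: newidle balancing bails out early when nothing in
 * the root domain is overloaded. */
static int worth_newidle_balance(struct rq *this_rq)
{
        return READ_ONCE(this_rq->rd->overload);
}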

> I also still dislike how you never looked into the numa balancer, which
> already has preferred_nid stuff.

Not sure if you mean using the existing NUMA balancer or enhancing it. If
the former, preferred_nid is derived from memory access faults and works
at node granularity, not at the granularity of an arbitrary cpumask.
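
For completeness, the preferred_nid machinery in outline: NUMA hinting
faults are accumulated per node, and the node with the most faults becomes
numa_preferred_nid, which later wakeup and balancing decisions favor. A
condensed, hedged paraphrase of task_numa_placement(), not the actual
code:

static void pick_preferred_node(struct task_struct *p)
{
        int nid, max_nid = NUMA_NO_NODE;
        unsigned long faults, max_faults = 0;

        /* Find the node on which this task incurred the most NUMA
         * hinting faults (shared + private). */
        for_each_online_node(nid) {
                faults = p->numa_faults[task_faults_idx(NUMA_MEM, nid, 0)] +
                         p->numa_faults[task_faults_idx(NUMA_MEM, nid, 1)];
                if (faults > max_faults) {
                        max_faults = faults;
                        max_nid = nid;
                }
        }

        /* Record the preference; placement then pulls toward it. */
        if (max_nid != NUMA_NO_NODE && max_nid != p->numa_preferred_nid)
                sched_setnuma(p, max_nid);
}

Note this is node-granular and fault-driven, so mapping a user-specified
soft-affinity cpumask onto it would be an extension rather than a reuse.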