Re: [RFC PATCH v3 00/16] Core scheduling v3
From: Tim Chen
Date: Thu Sep 12 2019 - 13:05:47 EST
On 9/12/19 5:04 AM, Aaron Lu wrote:
> Well, I have done following tests:
> 1 Julien's test script: https://paste.debian.net/plainh/834cf45c
> 2 start two tagged will-it-scale/page_fault1, see how each performs;
> 3 Aubrey's mysql test: https://github.com/aubreyli/coresched_bench.git
> They all show your patchset performs equally well...And consider what
> the patch does, I think they are really doing the same thing in
> different ways.
The new feature in my new patches attempts to load balance between cores
and to remove the imbalance of cgroup load on a core that causes forced idle,
whereas the previous patches aim for cgroup fairness between sibling threads.
So I think the goals are kind of orthogonal and complementary.
The premise is this: say cgroup1 is occupying 50% of cpu on cpu thread 1
and 25% of cpu on cpu thread 2. That means we have a 25% cpu imbalance
and the cpu is force idled 25% of the time. So ideally we need to move
12.5% of cgroup1 load from cpu thread 1 to sibling thread 2, so that
both threads run cgroup1 load at 37.5% without causing
any force idled time. Otherwise we will try to move 25% of cgroup1
load from cpu thread 1 to another core that has cgroup1 load to match.
This load balance is done in the regular load balance paths.
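
To make the arithmetic above concrete, here is a minimal sketch. It is not
code from the patches: the function name is hypothetical, and it assumes a
simplified model in which only one cgroup runs on the two sibling threads
and the less loaded sibling is forced idle for the whole difference.

```python
def sibling_rebalance(load_t1, load_t2):
    """Model one cgroup on two SMT siblings (hypothetical helper).

    Loads are fractions of a cpu thread (0.0 to 1.0).
    Returns (forced_idle, amount_to_move, balanced_load).
    """
    # Assumption: whenever the busier sibling runs the group beyond
    # what the other sibling can match, that other sibling is forced idle.
    forced_idle = abs(load_t1 - load_t2)
    # Moving half of the difference equalizes the two siblings.
    amount_to_move = forced_idle / 2
    balanced_load = (load_t1 + load_t2) / 2
    return forced_idle, amount_to_move, balanced_load


# cgroup1 at 50% on thread 1 and 25% on thread 2, as in the example above
print(sibling_rebalance(0.50, 0.25))  # (0.25, 0.125, 0.375)
```

With 50% and 25% this gives 25% forced idle, 12.5% to move, and 37.5% on
each thread afterwards, matching the numbers above.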
Previously, in v3, only sched_core_balance made an attempt to pull a cookie
task, and only in the idle balance path. So if the cpu is kept busy, the
cgroup load imbalance between sibling threads could last a long time. And the
thread fairness patches for v3 don't help to balance load in such cases.
The new patches take into account the actual amount of load imbalance
of the same group between sibling threads when selecting a task to pull,
and they also prevent task migrations that would create
more load imbalance. So hopefully this feature will help when we have
more cores and need load balance across the cores. It tries to
even out cgroup workload between threads to minimize forced idle time, and also
to even out load across cores.
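
As a rough illustration of the selection idea (again not the patch code; all
names here are hypothetical and the model is simplified to one group's load
per sibling thread), a pull is accepted only if it lowers the combined
sibling imbalance of the source and destination cores:

```python
def core_imbalance(thread_loads):
    """Spread of one group's load across the sibling threads of a core."""
    return max(thread_loads) - min(thread_loads)


def imbalance_after_move(src, dst, src_thread, dst_thread, task_load):
    """Combined imbalance of both cores if the task were migrated."""
    src_after = list(src)
    dst_after = list(dst)
    src_after[src_thread] -= task_load
    dst_after[dst_thread] += task_load
    return core_imbalance(src_after) + core_imbalance(dst_after)


def pick_task(src, dst, src_thread, dst_thread, task_loads):
    """Return the candidate load that reduces imbalance the most.

    Returns None if every candidate would leave the imbalance
    unchanged or make it worse, i.e. the migration is refused.
    """
    best = None
    best_after = core_imbalance(src) + core_imbalance(dst)
    for load in task_loads:
        after = imbalance_after_move(src, dst, src_thread, dst_thread, load)
        if after < best_after:
            best, best_after = load, after
    return best


# Source core has cgroup1 at 50%/25%; destination core has 25%/50%.
print(pick_task([0.50, 0.25], [0.25, 0.50], 0, 0, [0.125, 0.25, 0.5]))
```

In this example, moving the 25% task from thread 1 of the source core to
the lighter thread of the destination core leaves both cores balanced, so
it is preferred over the 12.5% and 50% candidates.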
In your test, how many cores are on your machine and how many threads did
each page_fault1 spawn off?