Re: [PATCH 3/3] sched: limit cpu search and rotate search window for scalability

From: Subhra Mazumdar
Date: Tue Apr 24 2018 - 20:08:26 EST

On 04/24/2018 05:53 AM, Peter Zijlstra wrote:
> On Mon, Apr 23, 2018 at 05:41:16PM -0700, subhra mazumdar wrote:
> > Lower the lower limit of idle cpu search in select_idle_cpu() and also put
> > an upper limit. This helps in scalability of the search by restricting the
> > search window.
> >
> > @@ -6297,15 +6297,24 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >  	if (sched_feat(SIS_PROP)) {
> >  		u64 span_avg = sd->span_weight * avg_idle;
> > -		if (span_avg > 4*avg_cost)
> > +		if (span_avg > 2*avg_cost) {
> >  			nr = div_u64(span_avg, avg_cost);
> > -		else
> > -			nr = 4;
> > +			if (nr > 4)
> > +				nr = 4;
> > +		} else {
> > +			nr = 2;
> > +		}
>
> Why do you need to put a max on? Why isn't the proportional thing
> working as is? (is the average no good because of big variance or what)

Firstly, the choice of 512 seems arbitrary. Secondly, the logic here is
that the enqueuing cpu should search for up to the time in which it can
get work itself. Why is that the optimal amount to search?

> Again, why do you need to lower the min; what's wrong with 4?
>
> The reason I picked 4 is that many laptops have 4 CPUs and desktops
> really want to avoid queueing if at all possible.

To find the optimum upper and lower limits I varied them over many
combinations. 4 and 2 gave the best results across most benchmarks.
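
For reference, the two variants of the nr computation discussed above can
be compared outside the kernel with a small user-space sketch. This is
only an illustration under assumptions: nr_before() and nr_after() are
hypothetical helper names, plain 64-bit division stands in for div_u64(),
and the inputs are passed in directly rather than read from the sched
domain and runqueue.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the search width before the patch:
 * proportional to span_avg / avg_cost, with a lower bound of 4 and no
 * upper bound.
 */
static unsigned int nr_before(uint64_t span_weight, uint64_t avg_idle,
			      uint64_t avg_cost)
{
	uint64_t span_avg = span_weight * avg_idle;

	assert(avg_cost != 0);	/* caller must pass a nonzero cost */

	if (span_avg > 4 * avg_cost)
		return (unsigned int)(span_avg / avg_cost);
	return 4;
}

/*
 * Hypothetical model of the search width after the patch: lower bound
 * of 2, proportional in between, capped at 4.
 */
static unsigned int nr_after(uint64_t span_weight, uint64_t avg_idle,
			     uint64_t avg_cost)
{
	uint64_t span_avg = span_weight * avg_idle;
	uint64_t nr;

	assert(avg_cost != 0);	/* caller must pass a nonzero cost */

	if (span_avg > 2 * avg_cost) {
		nr = span_avg / avg_cost;
		if (nr > 4)
			nr = 4;	/* upper limit added by the patch */
	} else {
		nr = 2;		/* lowered minimum */
	}
	return (unsigned int)nr;
}
```

With made-up inputs span_weight = 8, avg_idle = 10, avg_cost = 5, the old
computation yields 16 while the patched one is capped at 4. In this
sketch the patched result always falls between 2 and 4.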