Re: [RFC][PATCH 1/5] sched/fair: Fix select_idle_cpu()s cost accounting

From: Peter Zijlstra
Date: Fri Jan 08 2021 - 15:22:47 EST


On Fri, Jan 08, 2021 at 10:27:38AM +0000, Mel Gorman wrote:

> 1. avg_scan_cost is now based on the average scan cost of a rq but
> avg_idle is still scaled to the domain size. This is a bit problematic
> because it's comparing scan cost of a single rq with the estimated
> average idle time of a domain. As a result, the scan depth can be much
> larger than it was before the patch and led to some regressions.

> @@ -6164,25 +6164,25 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>  		 */
>  		avg_idle = this_rq()->avg_idle / 512;
>  		avg_cost = this_sd->avg_scan_cost + 1;
> -
> -		span_avg = sd->span_weight * avg_idle;
> -		if (span_avg > 4*avg_cost)
> -			nr = div_u64(span_avg, avg_cost);
> -		else
> +		nr = div_u64(avg_idle, avg_cost);
> +		if (nr < 4)
>  			nr = 4;

Oooh, could it be I simply didn't remember how that code was supposed to
work and should kick my (much) younger self for not writing a comment?

Consider:

       span_weight * avg_idle               avg_cost
  nr = ---------------------- = avg_idle / -----------
              avg_cost                     span_weight

Where: avg_cost / span_weight ~= cost-per-rq
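
To make the identity concrete, here is a rough userspace sketch of both forms. It is only an illustration: nr_old() and nr_new() are made-up names, plain 64-bit division stands in for div_u64(), and the / 512 and + 1 adjustments from the real code are left out.

```c
#include <stdint.h>

/*
 * Old form: avg_cost is the cost of scanning the whole domain, so
 * avg_idle is scaled up by span_weight before dividing.
 * (Hypothetical helper, not a kernel function.)
 */
static uint64_t nr_old(uint64_t avg_idle, uint64_t avg_cost,
		       uint64_t span_weight)
{
	uint64_t span_avg = span_weight * avg_idle;

	if (span_avg > 4 * avg_cost)
		return span_avg / avg_cost;
	return 4;
}

/*
 * Form from the quoted hunk: no span_weight scaling, so this only
 * matches nr_old() when avg_cost is already a per-rq cost.
 * (Hypothetical helper, not a kernel function.)
 */
static uint64_t nr_new(uint64_t avg_idle, uint64_t avg_cost)
{
	uint64_t nr = avg_idle / avg_cost;

	return nr < 4 ? 4 : nr;
}
```

With span_weight = 8, avg_idle = 1000 and a domain-wide avg_cost of 400, nr_old(1000, 400, 8) gives 20. The per-rq cost is 400 / 8 = 50, and nr_new(1000, 50) gives the same 20. Feeding the domain-wide 400 into nr_new() instead gives 1000 / 400 = 2, which is clamped to 4, so the two forms only agree when avg_cost is in the units each one expects.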