Re: [RFC][PATCH 1/5] sched/fair: Fix select_idle_cpu()s cost accounting
From: Li, Aubrey
Date: Mon Dec 14 2020 - 22:39:00 EST
On 2020/12/15 0:48, Peter Zijlstra wrote:
> We compute the average cost of the total scan, but then use it as a
> per-cpu scan cost when computing the scan proportion. Fix this by
> properly computing a per-cpu scan cost.
>
> This also fixes a bug where we would terminate early (!--nr, case) and
> not account that cost at all.
I'm a bit worried this may introduce a regression under heavy load.
The overhead of adding another cpu_clock() and calculation becomes
significant when sis_scan is throttled by nr.
I'm not sure if it's a good idea to not account the scan cost at all
when sis_scan is throttled, that is, remove the first cpu_clock() as
well. The avg scan cost remains the value when the system is not very
busy, and when the load comes down and span avg idle > span avg cost,
we account the cost again. This should make select_idle_cpu() a bit
faster when the load is very high.
Thanks,
-Aubrey
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> kernel/sched/fair.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6144,10 +6144,10 @@ static inline int select_idle_smt(struct
> static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
> {
> struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> + int cpu, loops = 1, nr = INT_MAX;
> + int this = smp_processor_id();
> struct sched_domain *this_sd;
> u64 time;
> - int this = smp_processor_id();
> - int cpu, nr = INT_MAX;
>
> this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
> if (!this_sd)
> @@ -6175,14 +6175,19 @@ static int select_idle_cpu(struct task_s
> }
>
> for_each_cpu_wrap(cpu, cpus, target) {
> - if (!--nr)
> - return -1;
> if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> break;
> +
> + if (loops >= nr) {
> + cpu = -1;
> + break;
> + }
> + loops++;
> }
>
> if (sched_feat(SIS_PROP)) {
> time = cpu_clock(this) - time;
> + time = div_u64(time, loops);
> update_avg(&this_sd->avg_scan_cost, time);
> }
>
>
>