Re: [RFC PATCH v3 00/16] Core scheduling v3

From: Tim Chen
Date: Wed Sep 25 2019 - 13:24:25 EST


On 9/24/19 7:40 PM, Aubrey Li wrote:
> On Sat, Sep 7, 2019 at 2:30 AM Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
>> +static inline s64 core_sched_imbalance_delta(int src_cpu, int dst_cpu,
>> + int src_sibling, int dst_sibling,
>> + struct task_group *tg, u64 task_load)
>> +{
>> + struct sched_entity *se, *se_sibling, *dst_se, *dst_se_sibling;
>> + s64 excess, deficit, old_mismatch, new_mismatch;
>> +
>> + if (src_cpu == dst_cpu)
>> + return -1;
>> +
>> + /* XXX SMT4 will require additional logic */
>> +
>> + se = tg->se[src_cpu];
>> + se_sibling = tg->se[src_sibling];
>> +
>> + excess = se->avg.load_avg - se_sibling->avg.load_avg;
>> + if (src_sibling == dst_cpu) {
>> + old_mismatch = abs(excess);
>> + new_mismatch = abs(excess - 2*task_load);
>> + return old_mismatch - new_mismatch;
>> + }
>> +
>> + dst_se = tg->se[dst_cpu];
>> + dst_se_sibling = tg->se[dst_sibling];
>> + deficit = dst_se->avg.load_avg - dst_se_sibling->avg.load_avg;
>> +
>> + old_mismatch = abs(excess) + abs(deficit);
>> + new_mismatch = abs(excess - (s64) task_load) +
>> + abs(deficit + (s64) task_load);
>
> If I understood correctly, these formulas made an assumption that the task
> being moved to the destination is matched the destination's core cookie.

That's not the case. The task does not need to match the destination's core
cookie, as the core cookie may change after a context switch. The move only
needs to reduce the load mismatch between the destination CPU and its sibling
for that cgroup.
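
To make the arithmetic concrete, here is a rough user-space sketch of the
same calculation. It is not the kernel code: mismatch_delta() and its plain
integer arguments are hypothetical stand-ins for the per-CPU
tg->se[cpu]->avg.load_avg values, and the final return in the cross-core
branch is my assumption, since that part of the hunk is trimmed in the
quote above. A positive result means the move reduces the total sibling
load mismatch for the cgroup.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical user-space model of the mismatch formulas quoted above.
 * The four load arguments stand in for the cgroup's load_avg on the
 * source CPU, its sibling, the destination CPU, and its sibling.
 * to_sibling is nonzero when the destination is the source's own sibling.
 */
static int64_t mismatch_delta(int64_t src_load, int64_t src_sib_load,
			      int64_t dst_load, int64_t dst_sib_load,
			      int64_t task_load, int to_sibling)
{
	int64_t excess = src_load - src_sib_load;
	int64_t deficit, old_mismatch, new_mismatch;

	if (to_sibling) {
		/* The task leaves one thread and lands on the other. */
		old_mismatch = llabs(excess);
		new_mismatch = llabs(excess - 2 * task_load);
		return old_mismatch - new_mismatch;
	}

	deficit = dst_load - dst_sib_load;

	old_mismatch = llabs(excess) + llabs(deficit);
	new_mismatch = llabs(excess - task_load) +
		       llabs(deficit + task_load);

	/* Assumption: same "old - new" convention as the sibling case. */
	return old_mismatch - new_mismatch;
}
```

For example, with source loads 300/100, destination loads 50/250 and a
task load of 100, the mismatch drops from 400 to 200. Nothing in this
calculation looks at the cookie; only the per-cgroup loads are used.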

> so if
> the task is not matched with dst's core cookie and still have to stay
> in the runqueue
> then the formula becomes not correct.
>
>> /**
>> * update_sg_lb_stats - Update sched_group's statistics for load balancing.
>> * @env: The load balancing environment.
>> @@ -8345,7 +8492,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>> else
>> load = source_load(i, load_idx);
>>
>> - sgs->group_load += load;
>
> Why is this load update line removed?

This was removed accidentally. It should be restored.

>
>> + core_sched_imbalance_scan(sgs, i, env->dst_cpu);
>> +
>> sgs->group_util += cpu_util(i);
>> sgs->sum_nr_running += rq->cfs.h_nr_running;
>>
>


Thanks.

Tim