Re: [PATCH v2 1/3] sched/fair: fix rounding issue for asym packing

From: Valentin Schneider
Date: Wed Dec 19 2018 - 06:58:42 EST

On 19/12/2018 08:32, Vincent Guittot wrote:
> This is another UC, asym packing is used at SMT level for now and we
> don't face this kind of problem, it has been also tested and DynamiQ
> configuration which is similar to SMT : 1 CPU per sched_group
> The legacy b.L one was not the main target although it can be
> The rounding error is there and should be fixed and your UC is another
> case for legacy b.L that can be treated in another patch

I used that setup out of convenience for myself, but AFAICT that use-case
just stresses that issue.

In the end we still have an imbalance value that can vary with only
slight group_capacity changes. And that's true even if that group
contains a single CPU, as the imbalance value computed by
check_asym_packing() can vary by +/-1 with very tiny capacity changes
(rt/IRQ pressure). That means that sometimes you're off by one and don't
do the packing because some CPU was pressured, but some other time you
might do it because that CPU was *more* pressured. It's doesn't make sense.

In the end problem is that in update_sg_lb_stats() we do:

sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) / sgs->group_capacity;

and in check_asym_packing() we do:

env->imbalance = DIV_ROUND_CLOSEST(
sds->busiest_stat.avg_load * sds->busiest_stat.group_capacity,

So we end up with something like:

group_load * SCHED_CAPACITY_SCALE * group_capacity
imbalance = --------------------------------------------------
group_capacity * SCHED_CAPACITY_SCALE

Which you'd hope to reduce down to:

imbalance = group_load

but you get division errors all over the place. So actually, the fix we'd
want would be:

@@ -8463,9 +8463,7 @@ static int check_asym_packing(struct lb_env *env, struct sd_lb_stats *sds)
if (sched_asym_prefer(busiest_cpu, env->dst_cpu))
return 0;

- env->imbalance = DIV_ROUND_CLOSEST(
- sds->busiest_stat.avg_load * sds->busiest_stat.group_capacity,
+ env->imbalance = sds->busiest_stat.group_load;

return 1;