Re: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too small
From: Vincent Guittot
Date: Tue Mar 10 2020 - 03:57:57 EST
On Tue, 10 Mar 2020 at 04:42, 王贇 <yun.wang@xxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2020/3/9 7:15 PM, Vincent Guittot wrote:
> [snip]
> >>>> - load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
> >>>> + load = max(cfs_rq->load.weight, scale_load(cfs_rq->avg.load_avg));
> >>>>
> >>>> tg_weight = atomic_long_read(&tg->load_avg);
> >>>
> >>> Get the point, but IMHO fix scale_load_down() sounds better, to
> >>> cover all the similar cases, let's first try that way see if it's
> >>> working :-)
> >>
> >> Yeah, that might not be a bad idea as well; it's just that doing this
> >> fix would keep you from losing all your precision (and I'd have to think
> >> if that would result in fairness issues like having all the group ses
> >> having the full tg shares, or something like that).
> >
> > AFAICT, we already have a fairness problem case because
> > scale_load_down is used in calc_delta_fair() so all sched groups that
> > have a weight lower than 1024 will end up with the same increase of
> > their vruntime when running.
> > Then the load_avg is used to balance between rq so load_balance will
> > ensure at least 1 task per CPU but not more because the load_avg which
> > is then used will stay null.
> >
> > That being said, having a min of 2 for scale_load_down will enable us
> > to have the tg->load_avg != 0 so a tg_weight != 0 and each sched group
> > will not have the full shares. But it will make those groups completely
> > fair anyway.
> > The best solution would be not to scale down the weight but that's a
> > bigger change
>
> Does that mean changing all those 'load.weight' related
> calculations, to preserve the scaled weight?
yes, to make sure that the calculations still fit in the variables
>
> I suppose u64 is large enough for 'cfs_rq.load' to hold the scaled-up
> load; changing all those places could be annoying but still fine.
it's fine but the max number of runnable tasks at the max priority on
a cfs_rq will decrease from around 4 billion to "only" 4 million.
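
As a rough check of that ratio (a user-space sketch, not kernel code):
keeping the weights scaled up multiplies every per-task weight by
2^SCHED_FIXEDPOINT_SHIFT = 1024, so the same accumulator holds 1024 times
fewer tasks. Reading "around 4 billion" as 2^32 is an assumption made
for the illustration, and max_tasks_after_scaling() is a hypothetical
helper, not an existing function.

```c
#include <stdint.h>

/* Same shift the kernel uses for load weights on 64-bit. */
#define SCHED_FIXEDPOINT_SHIFT 10

/*
 * Hypothetical helper: if each task's weight grows by 2^shift while the
 * variable that sums the weights keeps its width, the number of tasks
 * that fit shrinks by the same factor.
 */
static uint64_t max_tasks_after_scaling(uint64_t max_tasks_before,
					unsigned int shift)
{
	return max_tasks_before >> shift;
}
```

With max_tasks_before = 2^32, the helper returns 2^22 = 4194304, which
is the "only" 4 million above.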
>
> However, I'm not quite sure about the benefit: how much more precision
> will we gain, and does that really matter? Better to have some testing
> to demonstrate it.
it will ensure better fairness over a larger range of share values. I
agree that we can wonder if it's worth the effort for those low share
values. It would be interesting to know who uses such low values and for
which purpose
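
To make the precision loss discussed above concrete, here is a minimal
user-space sketch (not kernel code). The shift value and scale_load()
follow the 64-bit kernel definitions; scale_load_down_plain() models the
current plain shift, and scale_load_down_min2() is only one possible
shape of the "min of 2" idea, an assumption for illustration rather
than the actual patch.

```c
#define SCHED_FIXEDPOINT_SHIFT 10

/* Scale a user-visible weight up to the high-resolution form. */
static unsigned long scale_load(unsigned long w)
{
	return w << SCHED_FIXEDPOINT_SHIFT;
}

/* Current behaviour: a plain right shift, so any weight < 1024 becomes 0. */
static unsigned long scale_load_down_plain(unsigned long w)
{
	return w >> SCHED_FIXEDPOINT_SHIFT;
}

/*
 * Hypothetical variant: non-zero weights never scale down below 2,
 * so tg->load_avg can no longer end up as 0 for a non-empty group.
 */
static unsigned long scale_load_down_min2(unsigned long w)
{
	unsigned long down = w >> SCHED_FIXEDPOINT_SHIFT;

	if (w && down < 2)
		down = 2;
	return down;
}
```

With the plain shift, weights 2, 512 and 1023 all scale down to 0, so
they are indistinguishable afterwards. With the floor of 2, every
non-zero weight up to 3071 scales down to 2: the result is no longer
zero, but those groups are still treated as equal, which is the
"completely fair anyway" case. Only from 3072 upward (which gives 3) do
the scaled-down values start to differ.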
Regards,
Vincent
>
> Regards,
> Michael Wang
>
>
> >