Re: [PATCH v5 09/14] sched: Add over-utilization/tipping point indicator

From: Vincent Guittot
Date: Mon Aug 06 2018 - 06:45:58 EST


On Mon, 6 Aug 2018 at 11:43, Quentin Perret <quentin.perret@xxxxxxx> wrote:
>
> On Monday 06 Aug 2018 at 10:40:46 (+0200), Vincent Guittot wrote:
> > On Fri, 3 Aug 2018 at 17:55, Quentin Perret <quentin.perret@xxxxxxx> wrote:
> > For every new task, the cpu selection is done assuming it's a heavy
> > task with the max possible load_avg, and it looks for the idlest cpu.
> > This means that if the system is lightly loaded, the scheduler will
> > most probably select an idle big core.
>
> Agreed, that is what should happen if the system is lightly loaded.
> However, I'm still not totally convinced this is wrong. It's
> definitely not _always_ wrong, at least. Just like starting new tasks
> on little CPUs isn't always wrong either.

As explained before, IMHO, this is not wrong if you look for
performance, but it is wrong if you look for power saving.

>
> > Selecting big or little is not the problem here. The problem is that
> > we don't use the Energy Model, so we will most probably make the wrong
> > choice. Nevertheless, putting a task on big is clearly the wrong
> > choice in the case I mentioned above, "shell script on hikey960".
>
> _You_ can say it's wrong because _you_ know the task composition. The
> scheduler has no way to tell. You could come up with a script that
> spawns heavy tasks every once in a while, and in this case putting
> those on big cores would be beneficial ...
>
> > Having something in the middle like taking into account load and/or
> > utilization of the parent in order to mitigate big task starting with
> > small utilization and small task starting with big utilization.
> > It's probably not perfect because big tasks can create small ones and
> > the opposite but if there are already big tasks, assuming that the new
> > one is also a big one should have less power impact as we are already
> > consuming power for the current bigs. At the opposite, if little are
> > running, assuming that new task is little will not harm the power
> > consumption unnecessarily.
>
> Right, we can definitely come up with something more conservative than
> what I'm currently proposing. I had a quick chat with Morten about this
> the other day and one suggestion he had was to pick the CPU with the max
> spare cap in the frequency domain in which the parent task is running ...
>
> In any case, I really feel like there isn't an obvious right decision
> here, so I'd prefer to keep things simple for now. This patch-set is a
> first step, and fine-grained tuning for new tasks is probably something
> that can be done later, if need be. What do you think ?
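
For illustration only, the "max spare cap in the parent's frequency
domain" idea quoted above could look roughly like the sketch below. This
is a toy userspace model, not kernel code: struct toy_cpu,
toy_spare_cap() and toy_select_cpu_fork() are hypothetical names, and
the per-CPU capacity and utilization values are assumed to be given.

```c
#include <stddef.h>

/* Hypothetical, simplified CPU model; these are not the kernel's structures. */
struct toy_cpu {
	int freq_domain;        /* id of the frequency domain this CPU sits in */
	unsigned long capacity; /* maximum compute capacity of the CPU */
	unsigned long util;     /* current utilization, same scale as capacity */
};

/* Spare capacity, clamped at zero when the CPU is over-committed. */
static unsigned long toy_spare_cap(const struct toy_cpu *cpu)
{
	return cpu->util < cpu->capacity ? cpu->capacity - cpu->util : 0;
}

/*
 * Pick a CPU for a newly forked task: stay in the parent's frequency
 * domain and take the CPU with the most spare capacity there.
 * Returns the CPU index, or -1 if parent_cpu is out of range.
 * Ties go to the lowest-numbered CPU.
 */
static int toy_select_cpu_fork(const struct toy_cpu *cpus, size_t nr_cpus,
			       size_t parent_cpu)
{
	unsigned long best_spare = 0;
	int best = -1;
	size_t i;

	if (parent_cpu >= nr_cpus)
		return -1;

	for (i = 0; i < nr_cpus; i++) {
		unsigned long spare;

		/* Only consider CPUs sharing the parent's frequency domain. */
		if (cpus[i].freq_domain != cpus[parent_cpu].freq_domain)
			continue;

		spare = toy_spare_cap(&cpus[i]);
		if (best < 0 || spare > best_spare) {
			best_spare = spare;
			best = (int)i;
		}
	}

	return best;
}
```

The point of staying in the parent's domain is that a child of a task
running on littles stays on littles, and a child of a task running on
bigs stays where power is already being spent.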

I would have preferred to have a full power policy for all tasks when
EAS is in use by default, and then see if there is any performance
problem, instead of leaving some use cases unclear, but that's a
personal opinion.
So IMO, the minimum is to add a comment in the code that describes
this behavior for forked tasks, so that people who read the code
trying to understand the behavior will see why EAS puts newly created
tasks on not "EAS friendly" CPUs.
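
A rough sketch of the kind of comment meant here, attached to a toy
selection function. The enum and function names are hypothetical and
only stand in for the real wake-up path; the actual wording and
placement would be up to the patch.

```c
/* Hypothetical names, for illustration only. */
enum toy_balance_kind { TOY_BALANCE_WAKE, TOY_BALANCE_FORK };
enum toy_path { TOY_PATH_ENERGY_AWARE, TOY_PATH_IDLEST_CPU };

/*
 * Energy-aware placement is only used for wake-ups of existing tasks.
 *
 * A newly forked task has no utilization history yet, so its CPU is
 * selected as if it were a heavy task with the maximum possible
 * load_avg, by looking for the idlest CPU. On a lightly loaded system
 * this will most probably be an idle big CPU, which can be fine for
 * performance but is not the "EAS friendly" choice for power.
 */
static enum toy_path toy_select_path(enum toy_balance_kind kind)
{
	if (kind == TOY_BALANCE_FORK)
		return TOY_PATH_IDLEST_CPU;

	return TOY_PATH_ENERGY_AWARE;
}
```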

>
> Thanks,
> Quentin