Re: [RFC] Documentation/scheduler/schedutil.txt

From: Vincent Guittot
Date: Mon Nov 23 2020 - 08:43:31 EST

Next message: Will Deacon: "[PATCH] PCI: Mark AMD Raven iGPU ATS as broken"
Previous message: Mimi Zohar: "Re: [PATCH v6 0/8] IMA: support for measuring kernel integrity critical data"
In reply to: Dietmar Eggemann: "Re: [RFC] Documentation/scheduler/schedutil.txt"
Next in thread: Valentin Schneider: "Re: [RFC] Documentation/scheduler/schedutil.txt"

On Mon, 23 Nov 2020 at 12:27, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>
> On 23/11/2020 11:05, Vincent Guittot wrote:
> > On Mon, 23 Nov 2020 at 10:30, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> >>
> >> On 20/11/2020 09:56, Peter Zijlstra wrote:
> >>> On Fri, Nov 20, 2020 at 08:55:27AM +0100, Peter Zijlstra wrote:
> >>>> - In saturated scenarios task movement will cause some transient dips,
> >>>> suppose we have a CPU saturated with 4 tasks, then when we migrate a task
> >>>> to an idle CPU, the old CPU will have a 'running' value of 0.75 while the
> >>>> new CPU will gain 0.25. This is inevitable and time progression will
> >>>> correct this. XXX do we still guarantee f_max due to no idle-time?
> >>>
> >>> Do we want something like this? Is the 1.5 threshold sane? (it's been too
> >>> long since I looked at actual numbers here)
> >>
> >> Did some tests on a big.LITTLE system:
> >>
> >> (1) rt-app workload on big CPU:
> >>
> >> - task0-3 (runtime/period=4000us/16000us, started with
> >> 4000us delay to each other) run on CPU1
> >> - then task3 migrates to CPU2 and runs there for 64ms
> >> - then task2 migrates to CPU2 too and both tasks run there
> >> for another 64ms
> >>
> >> ...
> >> task3-3-1684 [001] 3982.798729: sched_pelt_cfs: cpu=1 path=/ load=232890 runnable=3260 util=1011
> >> migration/1-14 [001] 3982.798756: sched_migrate_task: comm=task3-3 pid=1684 prio=101 orig_cpu=1 dest_cpu=2*
> >> migration/1-14 [001] 3982.798767: sched_pelt_cfs: cpu=1 path=/ load=161374 runnable=2263 util=*700* <-- util dip !!!
> >> task1-1-1682 [001] 3982.799802: sched_pelt_cfs: cpu=1 path=/ load=160988 runnable=2257 util=706
> >> ...
> >> task2-2-1683 [001] 3982.849123: sched_pelt_cfs: cpu=1 path=/ load=161124 runnable=2284 util=904
> >> task2-2-1683 [001] 3982.851960: sched_pelt_cfs: cpu=1 path=/ load=160130 runnable=2271 util=911
> >> migration/1-14 [001] 3982.851984: sched_migrate_task: comm=task2-2 pid=1683 prio=101 orig_cpu=1 dest_cpu=2**
> >> migration/1-14 [001] 3982.851995: sched_pelt_cfs: cpu=1 path=/ load=88672 runnable=*1257* util=512 <-- runnable below 1536
> >> task1-1-1682 [001] 3982.852983: sched_pelt_cfs: cpu=1 path=/ load=88321 runnable=1252 util=521
> >> ...
> >>
> >>
> >> * task1,2,3 remain on CPU1 and still have to catch up, no idle
> >> time on CPU1
> >>
> >> ** task 1,2 remain on CPU1, there is idle time on CPU1!
> >>
> >>
> >> (2) rt-app workload on LITTLE CPU (orig cpu_capacity: 446)
> >>
> >> - task0-3 (runtime/period=1742us/16000us, started with
> >> 4000us delay to each other) run on CPU4
> >> - then task3 migrates to CPU5 and runs there for 64ms
> >> - then task2 migrates to CPU5 too and both tasks run there
> >> for another 64ms
> >>
> >> ...
> >> task1-1-1777 [004] 789.443015: sched_pelt_cfs: cpu=4 path=/ load=234718 runnable=3018 util=976
> >> migration/4-29 [004] 789.444718: sched_migrate_task: comm=task3-3 pid=1779 prio=101 orig_cpu=4 dest_cpu=5*
> >> migration/4-29 [004] 789.444739: sched_pelt_cfs: cpu=4 path=/ load=163543 runnable=2114 util=*778* <--util dip !!!
> >> task2-2-1778 [004] 789.447013: sched_pelt_cfs: cpu=4 path=/ load=163392 runnable=2120 util=777
> >> ...
> >> task1-1-1777 [004] 789.507012: sched_pelt_cfs: cpu=4 path=/ load=164482 runnable=2223 util=879
> >> migration/4-29 [004] 789.508023: sched_migrate_task: comm=task2-2 pid=1778 prio=101 orig_cpu=4 dest_cpu=5**
> >> migration/4-29 [004] 789.508044: sched_pelt_cfs: cpu=4 path=/ load=94099 runnable=*1264* util=611 <-- runnable below 1536
> >> task0-0-1776 [004] 789.511011: sched_pelt_cfs: cpu=4 path=/ load=93898 runnable=1264 util=622
> >> ...
> >>
> >> * task1,2,3 remain on CPU4 and still have to catch up, no idle
> >> time on CPU4
> >>
> >> ** task 1,2 remain on CPU4, no idle time on CPU4 yet.
> >>
> >> So for the big CPU, there is idle time and for the LITTLE there
> >> isn't with runnable below the threshold.
> >
> > I'm not sure I understand what you want to highlight with your tests?
>
> I thought the question was whether 'runnable_avg = 1.5 x
> SCHED_CAPACITY_SCALE' is a good threshold to decide to drive frequency
> by runnable_avg or util_avg.

We can't use SCHED_CAPACITY_SCALE; we must use the CPU's capacity.
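
To make the difference concrete, here is a minimal sketch, not the
schedutil code. It assumes the rule under discussion is "drive frequency by
runnable_avg once it exceeds 1.5 x some capacity, otherwise by util_avg";
that direction is an assumption, and the helper names runnable_threshold()
and driving_signal() are hypothetical. The runnable/util values are taken
from the traces quoted above, 446 is the LITTLE capacity from test (2), and
1024 for the big CPU is an assumption (the thread does not state it).

```python
SCHED_CAPACITY_SCALE = 1024  # kernel fixed-point scale for capacity values


def runnable_threshold(cpu_capacity: int) -> int:
    """Return 1.5 x the given capacity using integer arithmetic."""
    return cpu_capacity + (cpu_capacity >> 1)


def driving_signal(util: int, runnable: int, cpu_capacity: int) -> str:
    """Hypothetical selector: name the signal that would drive frequency."""
    if runnable > runnable_threshold(cpu_capacity):
        return "runnable"
    return "util"


# LITTLE CPU sample after the second migration (cpu=4, runnable=1264, util=611).
little_fixed = driving_signal(611, 1264, SCHED_CAPACITY_SCALE)  # threshold 1536
little_scaled = driving_signal(611, 1264, 446)                  # threshold 669

# big CPU sample after the second migration (cpu=1, runnable=1257, util=512),
# assuming a capacity of 1024 for the big CPU.
big_fixed = driving_signal(512, 1257, SCHED_CAPACITY_SCALE)

print(little_fixed, little_scaled, big_fixed)
```

Under these assumptions the fixed 1536 threshold picks util for the LITTLE
sample even though that CPU has no idle time yet, whereas a threshold
derived from the CPU's own capacity picks runnable. The big CPU sample,
which does have idle time, stays on util.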

>
> [...]
