Re: [PATCH v2] sched/fair: update scale invariance of PELT
From: Dietmar Eggemann
Date: Fri Apr 28 2017 - 13:08:59 EST
On 28/04/17 16:52, Morten Rasmussen wrote:
> Hi Vincent,
[...]
> As mentioned above, waiting time, i.e. !running && weight, is not
> scaled, which causes trouble for load.
I ran some rt-app-based tests on a system with frequency and cpu invariance.
(1) Two periodic 20% tasks with 12ms period on a cpu (capacity=1024) at
625MHz (max 1100MHz) starting to run at the same time, so one task
(task1) is wakeup preempted. (I'm only considering the phase of the test
run where this is a stable condition, i.e. task1 is always wakeup
preempted by task2).
So the runtime of a task is 0.2*12ms*1100/625 = 4.2ms.
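As a quick check of that arithmetic, here is a minimal sketch in Python.
The helper name scaled_runtime_ms is made up for illustration and is not
taken from rt-app or the kernel. It assumes the 20% duty cycle is defined
at the max frequency, so the wall-clock runtime stretches by max/cur:

```python
def scaled_runtime_ms(duty, period_ms, cur_mhz, max_mhz):
    """Wall-clock runtime per period when the duty cycle is given at max_mhz.

    Assumption: the work per period is fixed, so running at a lower
    frequency stretches the runtime by max_mhz / cur_mhz.
    """
    return duty * period_ms * max_mhz / cur_mhz


# 20% task, 12ms period, 625MHz out of 1100MHz
print(round(scaled_runtime_ms(0.2, 12, 625, 1100), 1))   # 4.2
# Same task at the max frequency
print(round(scaled_runtime_ms(0.2, 12, 1100, 1100), 1))  # 2.4
```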
At the beginning of the preemption period, __update_load_avg_se(task1)
is called with running=0 and weight=0, at the end with running=0 and
weight=1024.
When task1 finally runs there are two calls with (running=1,
weight=1024) before the next wakeup preemption period for task1 starts
again with (running=0, weight=0).
task2, which doesn't suffer from wakeup preemption, starts with a call
with (running=0, weight=0), then there are two calls with (running=1,
weight=1024) before its next period starts again with (running=0, weight=0).
Task1 is runnable for 8.4ms and sleeps for 3.6ms whereas task2 is
runnable for 4.2ms and sleeps for 7.8ms.
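These numbers follow if the preempted task waits for the other task's
full runtime before it gets to run. A small sketch of that breakdown
(breakdown_ms is a hypothetical helper, and the "waits exactly one
runtime" model is an assumption that only fits the stable phase
described above):

```python
def breakdown_ms(runtime_ms, period_ms, preempted):
    """Return (runnable, sleep) time per period in ms.

    Assumption: a wakeup-preempted task first waits for the other
    task's whole runtime, then runs for its own runtime.
    """
    wait = runtime_ms if preempted else 0.0
    runnable = wait + runtime_ms
    return runnable, period_ms - runnable


runtime = 0.2 * 12 * 1100 / 625  # ~4.2ms at 625MHz
for name, preempted in (("task1", True), ("task2", False)):
    runnable, sleep = breakdown_ms(runtime, 12, preempted)
    print(f"{name}: runnable {runnable:.1f}ms, sleeps {sleep:.1f}ms")
```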
The load signal of task1 is ~600 whereas the load of task2 is ~200.
(2) Two periodic 20% tasks with 12ms period on a cpu (capacity=1024) at
1100MHz (max 1100MHz) starting to run at the same time, so one task
(task1) is wakeup preempted.
So the runtime of one task is 0.2*12ms*1100/1100 = 2.4ms.
Task1 is runnable for 4.8ms and sleeps for 7.2ms whereas task2 is
runnable for 2.4ms and sleeps for 9.6ms.
The load signal of task1 is ~400 whereas the load of task2 is ~200.
Like Morten said, the scaling for load works differently at different
OPPs. Scaling for utilization looks fine.
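To illustrate the direction of the effect, here is a rough
back-of-the-envelope model in Python. It is not PELT: it ignores the
geometric decay and just takes a duty-cycle average, assuming running
time is scaled by cur/max while waiting time is counted unscaled.
approx_load and SCALE are names made up for this sketch:

```python
SCALE = 1024  # full-scale load of a weight=1024 task (assumed for this sketch)


def approx_load(wait_ms, run_ms, period_ms, cur_mhz, max_mhz):
    """Crude average load over one period.

    Assumption: running time is frequency-scaled, waiting time is not.
    This is a plain duty-cycle average, not the PELT geometric series.
    """
    scaled_run = run_ms * cur_mhz / max_mhz
    return SCALE * (wait_ms + scaled_run) / period_ms


for cur in (625, 1100):
    run = 0.2 * 12 * 1100 / cur          # wall-clock runtime at this OPP
    t1 = approx_load(run, run, 12, cur, 1100)   # task1 waits one runtime
    t2 = approx_load(0.0, run, 12, cur, 1100)   # task2 never waits
    print(f"{cur}MHz: task1 ~{t1:.0f}, task2 ~{t2:.0f}")
```

In this model task2 comes out at about 205 at both OPPs, while task1
comes out at about 565 at 625MHz and about 410 at 1100MHz. It does not
reproduce the measured ~600 exactly, but it shows the same dependence on
the OPP for the preempted task only.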
IMHO, the implementation of your scale_time() function can't take
preemption into consideration.
I also did tests comparing the time_scaling implementation with tip
(contribution scaling) (two periodic tasks 20%/16ms at 625MHz/1100MHz
and 20%/32ms at 625MHz/1100MHz), and they show this behaviour as a
difference between time_scaling and tip.
-- Dietmar
[...]