Re: [PATCH v2] sched/fair: update scale invariance of PELT

From: Dietmar Eggemann
Date: Fri Apr 28 2017 - 13:08:59 EST


On 28/04/17 16:52, Morten Rasmussen wrote:
> Hi Vincent,

[...]

> As mentioned above, waiting time, i.e. !running && weight, is not
> scaled, which causes trouble for load.

I ran some rt-app-based tests on a system with frequency and cpu invariance.

(1) Two periodic 20% tasks with 12ms period on a cpu (capacity=1024) at
625MHz (max 1100MHz) starting to run at the same time, so one task
(task1) is wakeup preempted. (I'm only considering the phase of the test
run where this is a stable condition, i.e. task1 is always wakeup
preempted by task2).

So the runtime of a task is 0.2*12ms*1100/625 = ~4.2ms.

At the beginning of the preemption period, __update_load_avg_se(task1)
is called with running=0 and weight=0, at the end with running=0 and
weight=1024.

When task1 finally runs there are two calls with (running=1,
weight=1024) before the next wakeup preemption period for task1 starts
again with (running=0, weight=0).

Task2, which doesn't suffer from wakeup preemption, starts running
with (running=0, weight=0), then there are 2 calls with (running=1,
weight=1024) before it starts running again with (running=0, weight=0).

Task1 is runnable for 8.4ms and sleeps for 3.6ms whereas task2 is
runnable for 4.2ms and sleeps for 7.8ms.

The load signal of task1 is ~600 whereas the load of task2 is ~200.

(2) Two periodic 20% tasks with 12ms period on a cpu (capacity=1024) at
1100MHz (max 1100MHz) starting to run at the same time, so one task
(task1) is wakeup preempted.

So the runtime of one task is 0.2*12ms*1100/1100 = 2.4ms.

Task1 is runnable for 4.8ms and sleeps for 7.2ms whereas task2 is
runnable for 2.4ms and sleeps for 9.6ms.
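
For reference, the runtime/runnable/sleep numbers in (1) and (2) can be
reproduced with the small sketch below. It is only arithmetic, not kernel
code: task_times() and its parameters are hypothetical names, and it
assumes the preempted task waits for exactly one full runtime of the
other task before it runs.

```python
# Hypothetical helper, not a kernel function; inputs are the test parameters above.
def task_times(duty, period_ms, cur_mhz, max_mhz, preempted):
    # Runtime stretches by max/cur because the cpu runs below max frequency.
    runtime = duty * period_ms * max_mhz / cur_mhz
    # Assumption: a wakeup-preempted task first waits for the other task's
    # full runtime, so it is runnable for twice its own runtime.
    runnable = 2 * runtime if preempted else runtime
    return runtime, runnable, period_ms - runnable

for mhz in (625, 1100):
    for name, preempted in (("task1", True), ("task2", False)):
        runtime, runnable, sleep = task_times(0.2, 12, mhz, 1100, preempted)
        print(f"{mhz}MHz {name}: runtime={runtime:.1f}ms "
              f"runnable={runnable:.1f}ms sleep={sleep:.1f}ms")
```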

The load signal of task1 is ~400 whereas the load of task2 is ~200.

Like Morten said, the scaling for load works differently on different
OPPs. Scaling for utilization looks fine.
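
One way to see the OPP dependence is to compare the task1/task2 ratio of
runnable time with the ratio of the load signals, per OPP. The sketch
below only uses the approximate numbers quoted above (it does not compute
PELT); the dict layout and ratios() are hypothetical names for
illustration.

```python
# Approximate values reported above: (runnable time in ms, load signal).
measured = {
    625:  {"task1": (8.4, 600), "task2": (4.2, 200)},
    1100: {"task1": (4.8, 400), "task2": (2.4, 200)},
}

def ratios(mhz):
    # Return (runnable-time ratio, load ratio) of task1 over task2.
    (r1, l1), (r2, l2) = measured[mhz]["task1"], measured[mhz]["task2"]
    return r1 / r2, l1 / l2

for mhz in measured:
    runnable_ratio, load_ratio = ratios(mhz)
    print(f"{mhz}MHz: runnable ratio={runnable_ratio:.1f} "
          f"load ratio={load_ratio:.1f}")
```

The runnable-time ratio is 2 at both OPPs, but the load ratio is about 3
at 625MHz and about 2 at 1100MHz.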

IMHO, the implementation of your scale_time() function can't take
preemption into consideration.

I also ran tests comparing the time_scaling implementation with tip
(contribution scaling), using two periodic tasks (20%/16ms at
625MHz/1100MHz and 20%/32ms at 625MHz/1100MHz). They show this
OPP-dependent load as a difference between time_scaling and tip.

-- Dietmar

[...]