Re: [PATCH] sched/fair: reduce preemption with IDLE tasks runable(Internet mail)

From: benbjiang(蒋彪)
Date: Tue Aug 11 2020 - 23:19:50 EST


Hi,

> On Aug 11, 2020, at 11:54 PM, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>
> On 11/08/2020 02:41, benbjiang(蒋彪) wrote:
>> Hi,
>>
>>> On Aug 10, 2020, at 9:24 PM, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>
>>> On 06/08/2020 17:52, benbjiang(蒋彪) wrote:
>>>> Hi,
>>>>
>>>>> On Aug 6, 2020, at 9:29 PM, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>>>
>>>>> On 03/08/2020 13:26, benbjiang(蒋彪) wrote:
>>>>>>
>>>>>>
>>>>>>> On Aug 3, 2020, at 4:16 PM, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>>>>>
>>>>>>> On 01/08/2020 04:32, Jiang Biao wrote:
>>>>>>>> From: Jiang Biao <benbjiang@xxxxxxxxxxx>
>
> [...]
>
>>> Because of this very small weight (weight=3), compared to a SCHED_NORMAL
>>> nice 0 task (weight=1024), a SCHED_IDLE task is penalized by a huge
>>> se->vruntime value (1024/3 higher than for a SCHED_NORMAL nice 0 task).
>>> This should make sure it doesn't tick preempt a SCHED_NORMAL nice 0 task.
>> Could you please explain how the huge penalization of vruntime(1024/3) could
>> make sure SCHED_IDLE not tick preempting SCHED_NORMAL nice 0 task?
>>
>> Thanks a lot.
>
> Trace a run of 2 SCHED_OTHER (nice 0) tasks and 1 SCHED_IDLE task on a
> single CPU and trace_printk the conditions 'if (delta < 0)' and ' if
> (delta > ideal_runtime)' in check_preempt_tick().
>
> Then do the same with 3 SCHED_OTHER (nice 0) tasks. You can also change
> the niceness of the 2 SCHED_OTHER tasks to 19 to see some differences in
> kernelshark's task layout.
>
> rt-app (https://github.com/scheduler-tools/rt-app) is a nice tool to
> craft those artificial use cases.
With the rt-app tool and sched_switch traced via ftrace, the results are what I expected:

** 1normal+1idle: idle preempts normal every ~20ms **
<...>-92016 [002] d... 2398066.902477: sched_switch: prev_comm=normal0-0 prev_pid=92016 prev_prio=120 prev_state=S ==> next_comm=idle0-0 next_pid=91814 next_prio=120
<...>-91814 [002] d... 2398066.902527: sched_switch: prev_comm=idle0-0 prev_pid=91814 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=92016 next_prio=120
<...>-92016 [002] d... 2398066.922472: sched_switch: prev_comm=normal0-0 prev_pid=92016 prev_prio=120 prev_state=S ==> next_comm=idle0-0 next_pid=91814 next_prio=120
<...>-91814 [002] d... 2398066.922522: sched_switch: prev_comm=idle0-0 prev_pid=91814 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=92016 next_prio=120
<...>-92016 [002] d... 2398066.942292: sched_switch: prev_comm=normal0-0 prev_pid=92016 prev_prio=120 prev_state=S ==> next_comm=idle0-0 next_pid=91814 next_prio=120
<...>-91814 [002] d... 2398066.942343: sched_switch: prev_comm=idle0-0 prev_pid=91814 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=92016 next_prio=120
<...>-92016 [002] d... 2398066.962331: sched_switch: prev_comm=normal0-0 prev_pid=92016 prev_prio=120 prev_state=S ==> next_comm=idle0-0 next_pid=91814 next_prio=120

** 2normal+1idle: idle preempts normal every 600+ms **
<...>-49009 [002] d... 2400562.746640: sched_switch: prev_comm=normal0-0 prev_pid=49009 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400562.747502: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=S ==> next_comm=normal1-0 next_pid=198658 next_prio=120
<...>-198658 [002] d... 2400563.335262: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400563.336258: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=49009 next_prio=120
<...>-198658 [002] d... 2400564.017663: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400564.018661: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=49009 next_prio=120
<...>-198658 [002] d... 2400564.701063: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120

** 3normal+1idle: idle preempts normal every 1000+ms **
<...>-198658 [002] d... 2400415.780701: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400415.781699: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal2-0 next_pid=46478 next_prio=120
<...>-49009 [002] d... 2400416.806298: sched_switch: prev_comm=normal0-0 prev_pid=49009 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400416.807297: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal2-0 next_pid=46478 next_prio=120
<...>-198658 [002] d... 2400417.826910: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400417.827911: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal2-0 next_pid=46478 next_prio=120
<...>-49009 [002] d... 2400418.857497: sched_switch: prev_comm=normal0-0 prev_pid=49009 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120

** 2normal(nice 19)+1idle(nice 0): idle preempts normal every 30+ms **
<...>-187466 [002] d... 2401740.134249: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=49009 next_prio=139
<...>-198658 [002] d... 2401740.162182: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=139 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2401740.165177: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=49009 next_prio=139
<...>-49009 [002] d... 2401740.193110: sched_switch: prev_comm=normal0-0 prev_pid=49009 prev_prio=139 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2401740.196104: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal1-0 next_pid=198658 next_prio=139
<...>-198658 [002] d... 2401740.228029: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=139 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2401740.231022: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=49009 next_prio=139
<...>-198658 [002] d... 2401740.262946: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=139 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120

SCHED_IDLE tasks tick-preempt only rarely, but as long as they carry a nonzero weight the preemption cannot be avoided completely.
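
As a rough cross-check of the intervals in the traces, here is a small Python
sketch. It does two things: it extracts the gaps between successive switches to
the idle task from sched_switch lines, and it estimates that gap from the load
weights alone. The estimate is a simplification and an assumption on my side:
while the idle task runs, its vruntime advances 1024/3 times faster than wall
time, and each normal task then has to run long enough to catch up. Wakeup
preemption, minimum granularity and tick alignment are ignored. The weights
(1024 for nice 0, 15 for nice 19, 3 for SCHED_IDLE) are the kernel's; the
function names and the idle run lengths passed in (about 1 ms and 3 ms, read
off the traces) are only illustrative.

```python
import re

NICE_0_LOAD = 1024      # load weight of a nice 0 task
WEIGHT_NICE_19 = 15     # load weight of a nice 19 task
WEIGHT_IDLEPRIO = 3     # load weight of a SCHED_IDLE task

# Timestamp (seconds.microseconds) and the task being switched in.
SWITCH_RE = re.compile(r"(\d+)\.(\d{6}): sched_switch: .* ==> next_comm=(\S+)")


def switch_in_intervals_ms(trace_text, comm_prefix="idle"):
    """Gaps in ms between successive switches to a task whose comm starts with comm_prefix."""
    stamps_us = []
    for line in trace_text.splitlines():
        m = SWITCH_RE.search(line)
        if m and m.group(3).startswith(comm_prefix):
            stamps_us.append(int(m.group(1)) * 1_000_000 + int(m.group(2)))
    return [(b - a) / 1000 for a, b in zip(stamps_us, stamps_us[1:])]


def estimated_interval_ms(n_normal, normal_weight, idle_run_ms):
    """Wall time the normal tasks need to catch up with the idle task's vruntime."""
    # vruntime gained by the idle task while it ran for idle_run_ms.
    idle_gain = idle_run_ms * NICE_0_LOAD / WEIGHT_IDLEPRIO
    # Wall time one normal task must run to gain the same amount of vruntime.
    wall_per_task = idle_gain * normal_weight / NICE_0_LOAD
    return n_normal * wall_per_task


# Three consecutive lines copied from the 2normal+1idle trace.
SAMPLE = """\
<...>-198658 [002] d... 2400563.335262: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
<...>-187466 [002] d... 2400563.336258: sched_switch: prev_comm=idle0-0 prev_pid=187466 prev_prio=120 prev_state=R ==> next_comm=normal0-0 next_pid=49009 next_prio=120
<...>-198658 [002] d... 2400564.017663: sched_switch: prev_comm=normal1-0 prev_pid=198658 prev_prio=120 prev_state=R ==> next_comm=idle0-0 next_pid=187466 next_prio=120
"""

if __name__ == "__main__":
    print("measured  2normal+1idle:", switch_in_intervals_ms(SAMPLE))
    print("estimated 2normal+1idle:", estimated_interval_ms(2, NICE_0_LOAD, 1.0))
    print("estimated 3normal+1idle:", estimated_interval_ms(3, NICE_0_LOAD, 1.0))
    print("estimated 2normal(nice 19)+1idle:", estimated_interval_ms(2, WEIGHT_NICE_19, 3.0))
```

With these inputs the estimates land close to the 600+ms, 1000+ms and 30+ms
gaps seen in the traces, which fits the weight ratio alone setting how often
the idle task gets in. The 1normal+1idle case is left out because the normal
task sleeps there (prev_state=S) instead of being preempted.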

I wonder if the result is what you expected? :)

Thanks a lot.
Regards,
Jiang

>
> [...]