Re: [RFC PATCH 3/6] sched/dl: Try better placement even for deadline tasks that do not block

From: Dietmar Eggemann
Date: Thu Jul 11 2019 - 11:33:10 EST


On 7/11/19 2:00 PM, Peter Zijlstra wrote:
> On Thu, Jul 11, 2019 at 01:17:17PM +0200, Dietmar Eggemann wrote:
>> On 7/9/19 3:42 PM, Peter Zijlstra wrote:
>
>>>>> That is, we only do those callbacks from:
>>>>>
>>>>> schedule_tail()
>>>>> __schedule()
>>>>> rt_mutex_setprio()
>>>>> __sched_setscheduler()
>>>>>
>>>>> and the above looks like it can happen outside of those.
>
>> Is this what you are concerned about?
>>
>> (2 CPUs (CPU1, CPU2), 4 deadline tasks (thread0-X)) with
>>
>> @@ -1137,6 +1137,13 @@ static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
>> rf->cookie = lockdep_pin_lock(&rq->lock);
>>
>> #ifdef CONFIG_SCHED_DEBUG
>> +#ifdef CONFIG_SMP
>> + /*
>> + * There should not be pending callbacks at the start of rq_lock();
>> + * all sites that handle them flush them at the end.
>> + */
>> + WARN_ON_ONCE(rq->balance_callback);
>> +#endif
>>
>>
>> [ 87.251237] *** <--- queue_balance_callback(migrate_dl_task) p=[thread0-3 3627] on CPU2
>> [ 87.251261] WARNING: CPU: 2 PID: 3627 at kernel/sched/sched.h:1145 __schedule+0x56c/0x690
>> [ 87.615882] WARNING: CPU: 2 PID: 3616 at kernel/sched/sched.h:1145 task_rq_lock+0xe8/0xf0
>> [ 88.176844] WARNING: CPU: 2 PID: 3616 at kernel/sched/sched.h:1145 load_balance+0x4d0/0xbc0
>> [ 88.381905] WARNING: CPU: 2 PID: 3616 at kernel/sched/sched.h:1145 load_balance+0x7d8/0xbc0
>
> I'm not sure how we get 4 warns; I was thinking that as soon as we exit
> __schedule() we'd process the callback, so further warns would be
> avoided.
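
For reference, the flush that those four sites share looks roughly like
this (a simplified sketch based on kernel/sched/core.c of that era;
exact details may differ):

static void __balance_callback(struct rq *rq)
{
	struct callback_head *head, *next;
	void (*func)(struct rq *rq);
	unsigned long flags;

	raw_spin_lock_irqsave(&rq->lock, flags);
	head = rq->balance_callback;
	rq->balance_callback = NULL;	/* consume the whole list ... */
	while (head) {
		func = (void (*)(struct rq *))head->func;
		next = head->next;
		head->next = NULL;
		head = next;

		func(rq);		/* ... running each callback once */
	}
	raw_spin_unlock_irqrestore(&rq->lock, flags);
}

static inline void balance_callback(struct rq *rq)
{
	if (unlikely(rq->balance_callback))
		__balance_callback(rq);
}

So once we leave __schedule() the list should indeed be empty, which is
what made the repeated warnings from other contexts surprising.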

Reducing the warning to fire only on CPU1, I got another test run:

[ 6688.373607] *** <--- queue_balance_callback(migrate_dl_task) p=[thread0-3 4343] on CPU1
[ 6688.381557] WARNING: CPU: 1 PID: 4343 at kernel/sched/sched.h:1146 try_to_wake_up+0x614/0x788
...
[ 6688.505000] try_to_wake_up+0x614/0x788
[ 6688.508794] default_wake_function+0x34/0x48
[ 6688.513017] autoremove_wake_function+0x3c/0x68
[ 6688.517497] __wake_up_common+0x90/0x158
[ 6688.521374] __wake_up_common_lock+0x88/0xd0
[ 6688.525595] __wake_up+0x40/0x50
[ 6688.528787] wake_up_klogd_work_func+0x4c/0x88
[ 6688.533184] irq_work_run_list+0x8c/0xd8
[ 6688.537063] irq_work_tick+0x48/0x60
[ 6688.540598] update_process_times+0x44/0x60
[ 6688.544735] tick_sched_handle.isra.5+0x44/0x68
[ 6688.549215] tick_sched_timer+0x50/0xa0
[ 6688.553007] __hrtimer_run_queues+0x11c/0x3d0
[ 6688.557316] hrtimer_interrupt+0xd8/0x248
[ 6688.561282] arch_timer_handler_phys+0x38/0x58
[ 6688.565678] handle_percpu_devid_irq+0x90/0x2b8
[ 6688.570160] generic_handle_irq+0x34/0x50
[ 6688.574124] __handle_domain_irq+0x68/0xc0
[ 6688.578175] gic_handle_irq+0x60/0xb0
...
[ 6688.589909] WARNING: CPU: 1 PID: 4343 at kernel/sched/sched.h:1146 scheduler_tick+0xe8/0x128
...
[ 6688.714463] scheduler_tick+0xe8/0x128
[ 6688.718170] update_process_times+0x48/0x60
[ 6688.722306] tick_sched_handle.isra.5+0x44/0x68
[ 6688.726786] tick_sched_timer+0x50/0xa0
[ 6688.730579] __hrtimer_run_queues+0x11c/0x3d0
[ 6688.734887] hrtimer_interrupt+0xd8/0x248
[ 6688.738852] arch_timer_handler_phys+0x38/0x58
[ 6688.743246] handle_percpu_devid_irq+0x90/0x2b8
[ 6688.747727] generic_handle_irq+0x34/0x50
[ 6688.751692] __handle_domain_irq+0x68/0xc0
[ 6688.755741] gic_handle_irq+0x60/0xb0
...
[ 6688.767476] WARNING: CPU: 1 PID: 4343 at kernel/sched/sched.h:1146 task_rq_lock+0xc0/0x100
...
[ 6688.891511] task_rq_lock+0xc0/0x100
[ 6688.895046] dl_task_timer+0x48/0x2c8
[ 6688.898666] __hrtimer_run_queues+0x11c/0x3d0
[ 6688.902975] hrtimer_interrupt+0xd8/0x248
[ 6688.906939] arch_timer_handler_phys+0x38/0x58
[ 6688.911334] handle_percpu_devid_irq+0x90/0x2b8
[ 6688.915815] generic_handle_irq+0x34/0x50
[ 6688.919779] __handle_domain_irq+0x68/0xc0
[ 6688.923828] gic_handle_irq+0x60/0xb0
...
[ 6688.944618] WARNING: CPU: 1 PID: 4343 at kernel/sched/sched.h:1146 update_blocked_averages+0x84c/0x9a0
...
[ 6689.071664] update_blocked_averages+0x84c/0x9a0
[ 6689.076231] run_rebalance_domains+0x74/0xb0
[ 6689.080452] __do_softirq+0x154/0x3f0
[ 6689.084074] irq_exit+0xf0/0xf8
[ 6689.087178] __handle_domain_irq+0x6c/0xc0
[ 6689.091228] gic_handle_irq+0x60/0xb0
...
[ 6689.303143] WARNING: CPU: 1 PID: 4343 at kernel/sched/sched.h:1146 __schedule+0x478/0x698
...
[ 6689.440861] __schedule+0x478/0x698
[ 6689.444310] schedule+0x38/0xc0
[ 6689.447416] do_notify_resume+0x88/0x380
[ 6689.451294] work_pending+0x8/0x14
...
[ 6689.459256] *** ---> migrate_dl_task() p=[thread0-3 4343] to CPU-1

>
>> [ 88.586991] *** ---> migrate_dl_task() p=[thread0-3 3627] to CPU1
>
> But yes, something like this. Basically, I want to avoid calling
> queue_balance_callback() from a context where we won't follow up with
> balance_callback().

Understood.
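
For anyone following along: queue_balance_callback() itself only chains
the callback onto the rq and never runs it (sketch based on
kernel/sched/sched.h of that era; details may differ):

static inline void
queue_balance_callback(struct rq *rq,
		       struct callback_head *head,
		       void (*func)(struct rq *rq))
{
	lockdep_assert_held(&rq->lock);

	/* Already queued; nothing to do. */
	if (unlikely(head->next))
		return;

	head->func = (void (*)(struct callback_head *))func;
	head->next = rq->balance_callback;
	rq->balance_callback = head;
}

Nothing in there executes func; the callback just sits on
rq->balance_callback until one of the four sites above reaches
balance_callback(). That is why a callback queued from, e.g., the
wakeup in the tick path above keeps tripping the WARN_ON_ONCE() in
rq_pin_lock() until the next __schedule() finally flushes it.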