[PATCH v2 0/4] sched/fair: Active balancer RT/DL preemption fix
From: Valentin Schneider
Date: Thu Aug 15 2019 - 10:51:54 EST
Vincent's load balance rework [1] got me thinking about how and where we
use rq.nr_running vs rq.cfs.h_nr_running checks, and this led me to
stare intently at the active load balancer.
I haven't seen it happen (yet), but from reading the code it really looks
like we can have some scenarios where the cpu_stopper ends up preempting
a task of a higher class than CFS (RT or DL), since we never actually look
at what the remote rq's running task is.
This series shuffles things around the CFS active load balancer to prevent
this from happening.
- Patch 1 is a freebie cleanup
- Patch 2 is a preparatory code move
- Patch 3 adds h_nr_running checks
- Patch 4 adds a sched class check + detach_one_task() to the active balance
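To make the intent of patches 3 and 4 concrete, here is a rough user-space
sketch of the decision they add. It is only a model: the toy_* names, the
enum ordering and the struct layout are made up for illustration and are not
the kernel's. The real code works on struct rq and the sched classes in
kernel/sched/fair.c.

```c
#include <stdbool.h>

/* Toy stand-ins, ordered by priority; NOT the kernel's sched_class. */
enum toy_class { TOY_CFS, TOY_RT, TOY_DL };

/* Toy model of the remote (busiest) runqueue; NOT struct rq. */
struct toy_rq {
	unsigned int nr_running;   /* runnable tasks of any class */
	unsigned int h_nr_running; /* runnable CFS tasks only */
	enum toy_class curr_class; /* class of the task running right now */
};

enum toy_action { TOY_NOTHING, TOY_DETACH_ONE, TOY_KICK_STOPPER };

/* Patch 3 idea: without a CFS task there is nothing for CFS LB to move. */
static bool toy_has_cfs_tasks(const struct toy_rq *rq)
{
	return rq->h_nr_running >= 1;
}

/* Patch 4 idea: is the remote CPU currently running something > CFS? */
static bool toy_curr_above_cfs(const struct toy_rq *rq)
{
	return rq->curr_class > TOY_CFS;
}

/*
 * Decide what the active balancer should do with the busiest rq:
 * - no CFS tasks: do nothing
 * - RT/DL task running: pull a queued CFS task directly, no stopper
 * - CFS task running: kick the cpu_stopper as before
 */
static enum toy_action toy_active_balance(const struct toy_rq *busiest)
{
	if (!toy_has_cfs_tasks(busiest))
		return TOY_NOTHING;
	if (toy_curr_above_cfs(busiest))
		return TOY_DETACH_ONE;
	return TOY_KICK_STOPPER;
}
```

In this model, checking nr_running alone would not be enough: a runqueue
holding a single RT task has nr_running == 1 but h_nr_running == 0, and
kicking the stopper there would only preempt the RT task for nothing.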
This is based on top of today's tip/sched/core:
a46d14eca7b7 ("sched/fair: Use rq_lock/unlock in online_fair_sched_group")
v1 -> v2:
- (new patch) Added need_active_balance() cleanup
- Tweaked active balance code move to respect existing
  sd->nr_balance_failed modifications
- Added explicit checks of active_load_balance()'s return value
- Added an h_nr_running < 1 check before kicking the cpu_stopper
- Added a detach_one_task() call in active_load_balance() when the remote
  rq's running task is > CFS
[1]: https://lore.kernel.org/lkml/1564670424-26023-1-git-send-email-vincent.guittot@xxxxxxxxxx/
Valentin Schneider (4):
  sched/fair: Make need_active_balance() return bools
  sched/fair: Move active balance logic to its own function
  sched/fair: Check for CFS tasks before detach_one_task()
  sched/fair: Prevent active LB from preempting higher sched classes

 kernel/sched/fair.c | 151 ++++++++++++++++++++++++++++----------------
 1 file changed, 95 insertions(+), 56 deletions(-)
--
2.22.0