[RFC PATCH 0/4] sched: Introduce cfs_migration

From: Yafang Shao
Date: Thu Nov 04 2021 - 10:57:36 EST


Active load balancing has a known issue[1][2]: there is a race window
between waking up the migration thread on the busiest CPU and that
thread actually preempting the currently running CFS task. Within this
window, the running CFS task may first be preempted by an RT task, and
the RT task is then preempted in turn by the newly woken migration
thread. Per our tracing, the latency caused by this preemption can
exceed 1ms, which is significant for RT tasks.

The balance work should run at a priority that lets it preempt CFS
tasks only. A new per-CPU thread, cfs_migration, is introduced for this
purpose. The cfs_migration thread runs at priority FIFO-1, which means
it can preempt any CFS task but cannot preempt other FIFO tasks.
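
To make the intended preemption rule concrete, here is a toy userspace
model. It is only a sketch and not kernel code: the toy_* names are
hypothetical, and it assumes the usual class ordering (stopper above
deadline above RT above CFS) with a strictly-higher-priority rule inside
the RT class. Same-class wakeup preemption for non-RT classes is
deliberately left out.

```c
#include <stdbool.h>

/* Toy scheduling classes, highest first (illustrative only). */
enum toy_class {
	TOY_STOP,	/* stopper thread, i.e. today's migration/N */
	TOY_DL,		/* SCHED_DEADLINE */
	TOY_RT,		/* SCHED_FIFO / SCHED_RR */
	TOY_FAIR,	/* CFS */
};

struct toy_task {
	enum toy_class cls;
	int rt_prio;	/* user-visible RT priority 1..99, used for TOY_RT only */
};

/*
 * Return true if a newly woken task would preempt the task currently
 * running on the CPU, under the simplified rules described above.
 */
static bool toy_preempts(struct toy_task woken, struct toy_task curr)
{
	/* A task from a higher class always preempts a lower class. */
	if (woken.cls != curr.cls)
		return woken.cls < curr.cls;

	/* Within RT, only a strictly higher priority preempts. */
	if (woken.cls == TOY_RT)
		return woken.rt_prio > curr.rt_prio;

	/* Same-class preemption for other classes is not modeled. */
	return false;
}
```

In this model a stopper task { TOY_STOP, 0 } preempts an RT task
{ TOY_RT, 50 }, which is the latency source described above, while a
FIFO-1 task { TOY_RT, 1 } preempts a CFS task { TOY_FAIR, 0 } but
neither { TOY_RT, 50 } nor another { TOY_RT, 1 }.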

Besides active load balancing, NUMA balancing also applies to CFS tasks
only, so the NUMA balance work is handed to cfs_migration as well.

[1]. https://lore.kernel.org/lkml/CAKfTPtBygNcVewbb0GQOP5xxO96am3YeTZNP5dK9BxKHJJAL-g@xxxxxxxxxxxxxx/
[2]. https://lore.kernel.org/lkml/20210615121551.31138-1-laoar.shao@xxxxxxxxx/

Yafang Shao (4):
  stop_machine: Move cpu_stop_done into stop_machine.h
  sched/fair: Introduce cfs_migration
  sched/fair: Do active load balance in cfs_migration
  sched/core: Do numa balance in cfs_migration

 include/linux/stop_machine.h |  12 +++
 kernel/sched/core.c          |   2 +-
 kernel/sched/fair.c          | 143 ++++++++++++++++++++++++++++++++++-
 kernel/sched/sched.h         |   2 +
 kernel/stop_machine.c        |  14 +---
 5 files changed, 158 insertions(+), 15 deletions(-)

--
2.17.1