[PATCH 0/4] Reduce scheduler migrations due to wake_affine

From: Mel Gorman
Date: Mon Dec 18 2017 - 04:43:46 EST

wake_affine has the impossible task of figuring out when it's best for a
waker to pull a wakee towards the waker's CPU on the expectation that data
locality will offset the cost of the migration. It's hurt by the fact that
most wakees cannot run on the current CPU, to avoid stacking multiple tasks
on one CPU by accident, so it depends heavily on topology and which nearby
CPU is idle. This series special-cases some wake_affine decisions.

Patch 1 was already posted but is a pre-requisite for the other patches. It
avoids wake_affine pulling a task to a different node if the wakeup
source is an interrupt. This is on the basis that we have little
knowledge of whether the CPU servicing the interrupt is relevant
to the data locality of the task being woken. The data from the
interrupt itself may be a tiny proportion of the task's working set.
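
As a rough illustration of the rule in patch 1, the decision can be
modelled as below. This is a toy sketch, not the code in the patch: the
struct, its fields and the function name are all hypothetical and do not
exist in kernel/sched/fair.c.

```c
#include <stdbool.h>

/* Hypothetical toy model of a wakeup; not a kernel data structure. */
struct wakeup {
	bool from_interrupt;	/* wakeup was issued from interrupt context */
	int waker_node;		/* NUMA node of the CPU handling the wakeup */
	int prev_node;		/* NUMA node of the CPU the wakee last ran on */
};

/* Return true if an affine pull towards the waker's CPU may be considered. */
static bool may_pull_to_waker(const struct wakeup *w)
{
	/*
	 * The CPU servicing an interrupt says little about where the
	 * wakee's data lives, so do not move the task across nodes
	 * on its account.
	 */
	if (w->from_interrupt && w->waker_node != w->prev_node)
		return false;

	return true;
}
```

In this model a cross-node wakeup from a task is still allowed to pull,
and so is an interrupt wakeup that stays on the wakee's previous node.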

Patch 2 notes that a previous CPU that is idle and cache affine with
the waker is probably a suitable idle sibling and that a search
in select_idle_sibling can be avoided.
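
The shortcut in patch 2 can be sketched in the same spirit. Again this is
a toy model under stated assumptions: the topology struct, the llc_id
array used to stand in for "cache affine", and the function names are
hypothetical, and a return value of -1 simply means "fall back to the
full idle-sibling search".

```c
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Hypothetical toy topology; not a kernel data structure. */
struct toy_topology {
	int llc_id[TOY_NR_CPUS];	/* equal ids share a last-level cache */
	bool idle[TOY_NR_CPUS];		/* true if the CPU is currently idle */
};

static bool toy_share_cache(const struct toy_topology *t, int a, int b)
{
	return t->llc_id[a] == t->llc_id[b];
}

/*
 * Return prev_cpu if it is idle and cache affine with the waker's CPU,
 * in which case no search is needed. Return -1 otherwise, meaning the
 * caller still has to search for an idle sibling.
 */
static int pick_prev_if_suitable(const struct toy_topology *t,
				 int waker_cpu, int prev_cpu)
{
	if (t->idle[prev_cpu] && toy_share_cache(t, waker_cpu, prev_cpu))
		return prev_cpu;

	return -1;
}
```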

Patch 3 just adds a comment for someone who doesn't know the history of
sync wakeups.

Patch 4 special-cases kworkers that run on a specific CPU as they can have
a synchronous relationship between waker and wakee.

The changelog includes some data but results would also be highly
machine-specific. For example, I noted a relatively small improvement from patch
1 while Mike Galbraith reported a significant gain on a different machine
for the same workload. YMMV.

kernel/sched/fair.c | 108 +++++++++++++++++++++++++++++++++++++++---------
kernel/sched/features.h | 8 ++++
2 files changed, 96 insertions(+), 20 deletions(-)