[PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable

From: Kirill Tkhai
Date: Thu Apr 26 2018 - 07:00:40 EST


mm_update_next_owner() searches for a new mm owner among children and
siblings, and then, in the unlikely case that none is found, iterates
over all processes in the system. Although this case is unlikely, its
probability grows with the number of processes in the system, and so
does the time spent iterating. I regularly observe
mm_update_next_owner() in crash dumps (unrelated to this function) from
nodes with many processes (20K+), so the case does not look that
unlikely.
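
For illustration only, the search order described above can be modelled
in userspace as follows. This is a toy sketch, not the kernel code:
struct toy_task, scan() and toy_next_owner() are hypothetical names, and
an integer mm_id stands in for task_struct::mm. Locking is ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; not the kernel structure. */
struct toy_task {
	int mm_id;                   /* stands in for task_struct::mm */
	struct toy_task *parent;
	struct toy_task **children;  /* array of child pointers */
	size_t nr_children;
};

/* Return the first task in list[] other than self that uses mm_id. */
static struct toy_task *scan(struct toy_task *const *list, size_t n,
			     const struct toy_task *self, int mm_id)
{
	for (size_t i = 0; i < n; i++)
		if (list[i] != self && list[i]->mm_id == mm_id)
			return list[i];
	return NULL;
}

/*
 * Look for a new owner of self's mm. *phase reports where it was found:
 * 1 = children, 2 = siblings, 3 = scan of all tasks, 0 = not found.
 */
struct toy_task *toy_next_owner(const struct toy_task *self,
				struct toy_task *const *all, size_t nr_all,
				int *phase)
{
	struct toy_task *t;

	assert(self != NULL && phase != NULL);

	/* Cheap case 1: one of our own children shares the mm. */
	*phase = 1;
	t = scan(self->children, self->nr_children, self, self->mm_id);
	if (t)
		return t;

	/* Cheap case 2: one of our siblings shares the mm. */
	*phase = 2;
	if (self->parent) {
		t = scan(self->parent->children, self->parent->nr_children,
			 self, self->mm_id);
		if (t)
			return t;
	}

	/* Unlikely case: walk every task; cost grows with their number. */
	*phase = 3;
	t = scan(all, nr_all, self, self->mm_id);
	if (!t)
		*phase = 0;
	return t;
}
```

The third phase is the one whose cost scales with the total number of
tasks, which is why it matters on nodes with 20K+ processes.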

The patchset reworks the locking in the function and makes it use
rcu_read_lock() for the iteration over all tasks in the system. This is
possible because task_struct::mm may be inherited only by child
processes and threads (kernel threads excepted), which are added to the
end of the tasks list or the threads list on fork(). So, the patchset
uses RCU and memory barriers to make the traversal of these lists
race-free [4/4]. Patches [1-3/4] are preparations for that.
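
As a rough userspace analogy of the idea (not the code from [4/4]), the
sketch below appends to the tail of a singly linked list with a release
store and walks it with acquire loads, using C11 atomics. All names in it
(struct node, struct list, list_append(), list_find()) are hypothetical.
It assumes a single writer at a time and never removes nodes; safe
removal and reclamation is the part RCU provides in the kernel and is not
modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node; mm_id stands in for task_struct::mm. */
struct node {
	int mm_id;
	_Atomic(struct node *) next;
};

struct list {
	struct node head;   /* sentinel, never matched by list_find() */
	struct node *tail;  /* touched by the (single) writer only */
};

void list_init(struct list *l)
{
	l->head.mm_id = -1;
	atomic_init(&l->head.next, NULL);
	l->tail = &l->head;
}

/* Writer side: initialise the node fully, then publish it at the tail. */
void list_append(struct list *l, struct node *n, int mm_id)
{
	assert(n != NULL);
	n->mm_id = mm_id;
	atomic_init(&n->next, NULL);
	/* Release: the stores above are visible before the node is linked. */
	atomic_store_explicit(&l->tail->next, n, memory_order_release);
	l->tail = n;
}

/* Reader side: lockless walk; acquire pairs with the writer's release. */
struct node *list_find(struct list *l, int mm_id)
{
	struct node *n = atomic_load_explicit(&l->head.next,
					      memory_order_acquire);

	while (n) {
		if (n->mm_id == mm_id)
			return n;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return NULL;
}
```

Because nodes are only ever linked after the current tail, a reader that
has not yet reached the end of the list still walks over a node appended
while the traversal is in progress.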

Kirill
---

Kirill Tkhai (4):
      exit: Move read_unlock() up in mm_update_next_owner()
      exit: Use rcu instead of get_task_struct() in mm_update_next_owner()
      exit: Rename assign_new_owner label in mm_update_next_owner()
      exit: Lockless iteration over task list in mm_update_next_owner()


 kernel/exit.c | 58 ++++++++++++++++++++++++++++++++++++++++++---------------
 kernel/fork.c |  1 +
 kernel/pid.c  |  5 ++++-
 3 files changed, 48 insertions(+), 16 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>