[patch 4/5] sched: Delay task stack freeing on RT

From: Thomas Gleixner
Date: Tue Sep 28 2021 - 08:24:49 EST


From: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

Anything which is done on behalf of a dead task at the end of
finish_task_switch() is preventing the incoming task from doing useful
work. While it is benefitial for fork heavy workloads to recycle the task
stack quickly, this is a latency source for real-time tasks.

Therefore delay the stack cleanup on RT enabled kernels.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
kernel/exit.c | 5 +++++
kernel/fork.c | 5 ++++-
kernel/sched/core.c | 8 ++++++--
3 files changed, 15 insertions(+), 3 deletions(-)

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -172,6 +172,11 @@ static void delayed_put_task_struct(stru
kprobe_flush_task(tsk);
perf_event_delayed_put(tsk);
trace_sched_process_free(tsk);
+
+ /* RT enabled kernels delay freeing the VMAP'ed task stack */
+ if (IS_ENABLED(CONFIG_PREEMPT_RT))
+ put_task_stack(tsk);
+
put_task_struct(tsk);
}

--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -289,7 +289,10 @@ static inline void free_thread_stack(str
return;
}

- vfree_atomic(tsk->stack);
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+ vfree_atomic(tsk->stack);
+ else
+ vfree(tsk->stack);
return;
}
#endif
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4846,8 +4846,12 @@ static struct rq *finish_task_switch(str
if (prev->sched_class->task_dead)
prev->sched_class->task_dead(prev);

- /* Task is done with its stack. */
- put_task_stack(prev);
+ /*
+ * Release VMAP'ed task stack immediate for reuse. On RT
+ * enabled kernels this is delayed for latency reasons.
+ */
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+ put_task_stack(prev);

put_task_struct_rcu_user(prev);
}