[PATCH misc 2/2] exit: Sleep at TASK_IDLE when waiting for application core dump
From: Paul E. McKenney
Date: Thu Aug 01 2024 - 20:30:56 EST
Currently, the coredump_task_exit() function sets the task state
to TASK_UNINTERRUPTIBLE|TASK_FREEZABLE, which usually works well.
But a combination of large memory and slow (and/or highly contended)
mass storage can cause application core dumps to take more than two
minutes, that is, longer than the default 120-second hung-task timeout.
At that point check_hung_task(), which is invoked by
check_hung_uninterruptible_tasks(), produces task-blocked splats.
These splats do not appear to provide any reasonable benefit.

Furthermore, as Oleg Nesterov points out, TASK_UNINTERRUPTIBLE could
be misleading because the task sleeping in coredump_task_exit() really
is killable, albeit indirectly. See the check of signal->core_state
in prepare_signal() and the check of fatal_signal_pending()
in dump_interrupted(), which bypass the normal unkillability of
TASK_UNINTERRUPTIBLE, resulting in coredump_finish() invoking
wake_up_process() on any threads sleeping in coredump_task_exit().
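
Therefore, change that TASK_UNINTERRUPTIBLE to TASK_IDLE, which is
TASK_UNINTERRUPTIBLE|TASK_NOLOAD. The hung-task detector skips tasks
that have TASK_NOLOAD set, and such tasks do not contribute to the
load average.
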
Reported-by: Anhad Jai Singh <ffledgling@xxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Christian Brauner <brauner@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
Cc: Chris Mason <clm@xxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxxx>
---
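Note for reviewers: the following user-space sketch illustrates why
the state change silences the splats. It is not kernel code. The
helper name hung_task_would_check() is hypothetical, the bit values
are assumed to match include/linux/sched.h in recent kernels, and the
predicate is my reading of the filter that
check_hung_uninterruptible_tasks() applies before calling
check_hung_task(). Please treat it as an approximation, not as the
authoritative logic.

```c
#include <stdbool.h>

/*
 * Task-state bits. ASSUMPTION: these values are copied from
 * include/linux/sched.h in recent kernels and may differ elsewhere.
 */
#define TASK_UNINTERRUPTIBLE	0x00000002
#define TASK_WAKEKILL		0x00000100
#define TASK_NOLOAD		0x00000400
#define TASK_FREEZABLE		0x00002000
#define TASK_IDLE		(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

/*
 * Hypothetical helper, not a kernel function. It models the filter
 * that decides whether a sleeping task is handed to check_hung_task():
 * the task must be uninterruptible, must not be killable
 * (TASK_WAKEKILL), and must not be marked TASK_NOLOAD.
 */
static inline bool hung_task_would_check(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) &&
	       !(state & TASK_WAKEKILL) &&
	       !(state & TASK_NOLOAD);
}
```

Under these assumptions, passing the old state
TASK_UNINTERRUPTIBLE|TASK_FREEZABLE yields true (the task is eligible
for a splat), while the new state TASK_IDLE|TASK_FREEZABLE yields
false because TASK_NOLOAD is set. TASK_FREEZABLE does not affect the
result in either case.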
kernel/exit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index 7430852a85712..0d62a53605dfc 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -428,7 +428,7 @@ static void coredump_task_exit(struct task_struct *tsk)
 			complete(&core_state->startup);
 
 		for (;;) {
-			set_current_state(TASK_UNINTERRUPTIBLE|TASK_FREEZABLE);
+			set_current_state(TASK_IDLE|TASK_FREEZABLE);
 			if (!self.task) /* see coredump_finish() */
 				break;
 			schedule();
--
2.40.1