Folks,
Olivier Langlois has been struggling with coredumps getting truncated in
tasks using io_uring. He has also apparently been struggling with
some of his email messages not making it to the lists.
We were talking about some of his struggles and questions in this area
and he pointed me to this patch he thought he had posted but I could not
find in the list archives.
In short the coredump code deliberately supports being interrupted by
SIGKILL, and depends upon prepare_signal to filter out all other
signals. With the io_uring code comes an extra test in signal_pending
for TIF_NOTIFY_SIGNAL (which is something about asking a task to run
task_work_run).
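To make the difference concrete, here is a small userspace model of the two predicates. It is only a sketch: struct model_task, the MODEL_* bit values and the model_* function names are made up for illustration, and the real kernel helpers operate on thread-info flags of a task_struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these bit values are illustrative, not the kernel's. */
enum {
	MODEL_TIF_SIGPENDING    = 1u << 0,
	MODEL_TIF_NOTIFY_SIGNAL = 1u << 1,
};

/* Hypothetical stand-in for the per-task flag word. */
struct model_task {
	unsigned int flags;
};

/* Models task_sigpending(): true only when a real signal is pending. */
static bool model_task_sigpending(const struct model_task *t)
{
	return (t->flags & MODEL_TIF_SIGPENDING) != 0;
}

/* Models signal_pending(): also true when only the notify bit is set. */
static bool model_signal_pending(const struct model_task *t)
{
	if (t->flags & MODEL_TIF_NOTIFY_SIGNAL)
		return true;
	return model_task_sigpending(t);
}

/* Self-check: a notify-only task looks interrupted to one predicate only. */
static void model_check(void)
{
	struct model_task t = { .flags = MODEL_TIF_NOTIFY_SIGNAL };

	assert(model_signal_pending(&t));
	assert(!model_task_sigpending(&t));
}
```

In this model a task with only the notify bit set passes the signal_pending-style test even though no signal was ever queued for it, which is the situation the dumper appears to be hitting.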
I am baffled why the dumper thread would be getting interrupted by
TIF_NOTIFY_SIGNAL but apparently it is. Perhaps it is an io_uring
thread that is causing the dump.
Now that we know the problem the question becomes how to fix this issue.
Is there any chance all of this TWA_SIGNAL logic could simply be removed
now that io_uring threads are normal process threads?
There are only the two call sites, so perhaps they could test
signal->flags & SIGNAL_FLAG_COREDUMP before scheduling a work on
a process that is dumping core?
Perhaps the coredump code needs to call task_work_run before it does
anything?
-----
From: Olivier Langlois <olivier@xxxxxxxxxxxxxx>
Subject: [PATCH] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL
Date: Mon, 07 Jun 2021 16:25:06 -0400
io_uring is a big user of task_work, and any event that io_uring has
made a task wait for will set TIF_NOTIFY_SIGNAL if it occurs while the
core dump is being generated.
Here are the detailed steps of the problem:
1. io_uring calls vfs_poll() to install a task to a file wait queue
with io_async_wake() as the wakeup function cb from io_arm_poll_handler()
2. wakeup function ends up calling task_work_add() with TWA_SIGNAL
3. task_work_add() sets the TIF_NOTIFY_SIGNAL bit by calling
set_notify_signal()
Signed-off-by: Olivier Langlois <olivier@xxxxxxxxxxxxxx>
---
fs/coredump.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/coredump.c b/fs/coredump.c
index 2868e3e171ae..79c6e3f114db 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -519,7 +519,7 @@ static bool dump_interrupted(void)
 	 * but then we need to teach dump_write() to restart and clear
 	 * TIF_SIGPENDING.
 	 */
-	return signal_pending(current);
+	return task_sigpending(current);
 }
 
 static void wait_for_dump_helpers(struct file *file)
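
For anyone who wants to see the effect of the one-line change without a kernel build, the three steps in the commit message can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: every sim_* name is hypothetical, the dump is reduced to a loop over fixed-size chunks, and the wakeup is assumed to fire just before a given chunk is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits, not the kernel's values. */
enum { SIM_SIGPENDING = 1, SIM_NOTIFY_SIGNAL = 2 };

/* Hypothetical stand-in for the dumping task. */
struct sim_task {
	unsigned int flags;
};

/* Steps 2-3: the poll wakeup queues task_work with TWA_SIGNAL,
 * which sets the notify bit on the target task. */
static void sim_task_work_add_twa_signal(struct sim_task *t)
{
	t->flags |= SIM_NOTIFY_SIGNAL;
}

/* Models dump_interrupted() before and after the patch. */
static bool sim_dump_interrupted(const struct sim_task *t, bool patched)
{
	if (patched)	/* task_sigpending(): real signals only */
		return (t->flags & SIM_SIGPENDING) != 0;
	/* signal_pending(): real signals or the notify bit */
	return (t->flags & (SIM_SIGPENDING | SIM_NOTIFY_SIGNAL)) != 0;
}

/* Writes up to 'total' chunks. A wakeup fires just before chunk
 * 'wake_at' would be written. Returns the number of chunks written. */
static size_t sim_dump(size_t total, size_t wake_at, bool patched)
{
	struct sim_task t = { 0 };
	size_t written = 0;

	while (written < total) {
		if (written == wake_at)
			sim_task_work_add_twa_signal(&t);
		if (sim_dump_interrupted(&t, patched))
			break;	/* dump stops here: truncated core */
		written++;
	}
	return written;
}

/* Self-check: the same wakeup truncates only the unpatched dump. */
static void sim_check(void)
{
	assert(sim_dump(10, 3, false) == 3);
	assert(sim_dump(10, 3, true) == 10);
}
```

In this model sim_dump(10, 3, false) stops after 3 chunks while sim_dump(10, 3, true) writes all 10. A pending SIGKILL would still be represented by SIM_SIGPENDING, so the patched check keeps the deliberate interruption path.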