Re: [PATCH v3] coredump: exit_files() in coredump_wait() if MMF_DUMP_MAPPED_SHARED is not set
From: Christian Brauner
Date: Fri Jun 19 2026 - 09:54:59 EST
> A coredump typically takes some time to complete. If we happen to hold a
> write lock with flock just before triggering the coredump, that write lock
> will not be released during the entire coredump process. As a result,
> other processes attempting to acquire the same write lock may experience
> significant delays.
>
> To address this, call exit_files() in the end of coredump_wait(), if
> MMF_DUMP_MAPPED_SHARED is not set. Note that early unlocking a flock on a
> file allows other processes to lock and modify the mapped data protected
> by the flock.
>
> Signed-off-by: Xin Zhao <jackzxcui1989@xxxxxxx>
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index bb6fdb1f458e..70698d06ec9d 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -548,6 +548,13 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
> 		}
> 	}
>
> +	/*
> +	 * Early unlocking a flock on a file allows other processes
> +	 * to lock and modify the mapped data protected by the flock.
> +	 */
> +	if (!mm_flags_test(MMF_DUMP_MAPPED_SHARED, tsk->mm))
> +		exit_files(tsk);
This doesn't work - at least not unconditionally. Tools like
systemd-coredump or apport walk the open file descriptors of the
crashing process. Specifically, systemd-coredump does:
1) /proc/[pid]/fd/: opendir(), then, per entry, readlinkat() to get the
   symlink target.
2) /proc/[pid]/fdinfo/: for each fd, it reads the fdinfo text lines.
The resulting blob is attached to the journal record as the
COREDUMP_OPEN_FDS= field. So the open-fd list is recorded as metadata,
retrievable later (e.g. coredumpctl info shows it).
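
As a rough illustration of the fd walk described above, here is a
minimal userspace sketch. It is not systemd-coredump's actual code;
list_open_fds() is a made-up helper name, and it only covers the
/proc/[pid]/fd/ part (the fdinfo files can be read in a similar loop).
If the dumping task has already gone through exit_files(), this loop
finds nothing to report.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: print "<fd> -> <target>" for every entry in
 * /proc/<pid>/fd and return the number of entries printed, or -1 if
 * the directory cannot be opened.
 */
static int list_open_fds(pid_t pid, FILE *out)
{
	char path[64];
	char target[PATH_MAX];
	struct dirent *de;
	int count = 0;
	DIR *dir;

	snprintf(path, sizeof(path), "/proc/%d/fd", (int)pid);
	dir = opendir(path);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		ssize_t n;

		/* Skip "." and "..". */
		if (de->d_name[0] == '.')
			continue;

		/* Each entry is a symlink to whatever the fd refers to. */
		n = readlinkat(dirfd(dir), de->d_name, target,
			       sizeof(target) - 1);
		if (n < 0)
			continue;	/* the fd may have gone away meanwhile */
		target[n] = '\0';

		fprintf(out, "%s -> %s\n", de->d_name, target);
		count++;
	}

	closedir(dir);
	return count;
}
```

Called as list_open_fds(pid, stdout) from a coredump handler, this
yields the kind of list that ends up in the metadata.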
Also, IIRC some clever implementations use pidfd_getfd() to preserve the
files of a coredumping process.
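
For reference, a sketch of what such a pidfd_getfd() user might look
like. grab_remote_fd() is a hypothetical name, and this assumes a
kernel with pidfd_open() and pidfd_getfd() (Linux 5.6 or later),
headers that define both syscall numbers, and ptrace-attach permission
over the target. It can only duplicate fds that the target still holds.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate file descriptor 'targetfd' of process
 * 'pid' into the calling process. Returns the new fd (with close-on-exec
 * set by the kernel) or -1 with errno set.
 */
static int grab_remote_fd(pid_t pid, int targetfd)
{
#if defined(SYS_pidfd_open) && defined(SYS_pidfd_getfd)
	int pidfd, fd, saved_errno;

	/* Get a stable handle on the target process. */
	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -1;

	/* Fails with EBADF if the target no longer has 'targetfd' open. */
	fd = (int)syscall(SYS_pidfd_getfd, pidfd, targetfd, 0);

	/* Do not let close() clobber the errno of a failed getfd. */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;
	return fd;
#else
	/* Headers too old to know about these syscalls. */
	(void)pid;
	(void)targetfd;
	errno = ENOSYS;
	return -1;
#endif
}
```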
So you break all that - and only in some cases, which is really
opaque to userspace. That's not acceptable. If you only care about the
case where you dump to a file, then either special-case it to the legacy
file coredump format or, if it's generally useful, make it an optional
argument that can be passed to the coredump pipe and a new flag
extension to the coredump socket that makes the coredumping process shed
its file descriptors.
--
Christian Brauner <brauner@xxxxxxxxxx>