Re: [PATCH v4] coredump: Add /proc/<pid>/coredump_pre_exit for pre-exit before dumping
From: Pedro Falcato
Date: Thu Jun 25 2026 - 07:18:56 EST
On Thu, Jun 25, 2026 at 12:57:02PM +0200, David Hildenbrand (Arm) wrote:
> >> +
> >> #define F_DUPFD 0 /* dup */
> >> #define F_GETFD 1 /* get close_on_exec */
> >> #define F_SETFD 2 /* set/clear close_on_exec */
> >> diff --git a/kernel/fork.c b/kernel/fork.c
> >> index a679b2448234..84f1ee7f32cf 100644
> >> --- a/kernel/fork.c
> >> +++ b/kernel/fork.c
> >> @@ -1030,6 +1030,18 @@ static int __init coredump_filter_setup(char *s)
> >>
> >> __setup("coredump_filter=", coredump_filter_setup);
> >>
> >> +static unsigned long default_dump_pre_exit;
> >> +
> >> +static int __init coredump_pre_exit_setup(char *s)
> >> +{
> >> + default_dump_pre_exit =
> >> + (simple_strtoul(s, NULL, 0) << MMF_DUMP_PRE_EXIT_SHIFT) &
> >> + MMF_DUMP_PRE_EXIT_MASK;
> >> + return 1;
> >> +}
> >> +
> >> +__setup("coredump_pre_exit=", coredump_pre_exit_setup);
> >
> > This makes no sense. I think you really need to sit down and think about
> > a design for this that doesn't introduce state machinery for boot, mm,
> > and the VFS in one shot to solve a fringe problem...
>
> Staring at exit_mmap_mapped_shared(), ... this looks rather hacky ("let's fake
> munmap and set some magical flags").
>
> We're essentially saying "we don't want (pretty much) anything that's MAP_SHARED
> in the coredump". And for some reason someone should configure that, that's a
> rather weird toggle tbh.
>
> And the granularity ("file-backed shared memory") is completely odd.
>
>
> Aren't there other ways we could optimize this internally?
>
> Like, if we know that a process is dead and cannot run anymore, downgrade writes
> to reads (and make sure we block GUP write attempts accordingly), or would that
> also not be sufficient?
>
>
> Another thought:
>
> fs/coredump.c calls get_dump_page().
>
> get_dump_page() will not fault in any memory. So if a page is not in the page
> tables at the time of the dump, it will not get included in the coredump. Which
> means, that whether most non-anonymous memory will be included in a coredump is
> already like playing the lottery.
>
> This is true for MAP_SHARED file mappings and MAP_PRIVATE file mappings without
> private modifications.
>
> Which makes me wonder: How much is tooling relying on file-backed pages to end
> up in a coredump?
FWIW this mechanism already exists, see /proc/self/coredump_filter. The
default is bits 0, 1, 4 and 5 (0x33, see core(5)), which means no file-backed
pages get written to a core dump, apart from ELF headers (these help the
debugger trace the mapped binary back to its debug info using the build ID).
So the answer to this question is "approximately none" :)
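
To make the bit arithmetic concrete, here is a rough userspace sketch that
decodes a coredump_filter mask. The bit positions follow core(5); the enum
names and the cf_* helpers are made up for illustration and are not kernel
or libc identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Bit positions as documented in core(5); the names are illustrative only. */
enum {
	CF_ANON_PRIVATE    = 0,
	CF_ANON_SHARED     = 1,
	CF_FILE_PRIVATE    = 2,
	CF_FILE_SHARED     = 3,
	CF_ELF_HEADERS     = 4,
	CF_HUGETLB_PRIVATE = 5,
	CF_HUGETLB_SHARED  = 6,
	CF_DAX_PRIVATE     = 7,
	CF_DAX_SHARED      = 8,
};

/* Default mask: bits 0, 1, 4 and 5. */
#define CF_DEFAULT_MASK 0x33UL

/* True if the mask asks for file-backed (private or shared) pages. */
static bool cf_dumps_file_pages(unsigned long mask)
{
	unsigned long file_bits = (1UL << CF_FILE_PRIVATE) |
				  (1UL << CF_FILE_SHARED);

	return (mask & file_bits) != 0;
}

/*
 * Read the calling process's filter from procfs, where it is shown as a
 * hexadecimal number. Returns 0 on success, -1 if the file cannot be
 * opened or parsed.
 */
static int cf_read_self(unsigned long *mask)
{
	FILE *f;
	int ok;

	assert(mask != NULL);

	f = fopen("/proc/self/coredump_filter", "r");
	if (!f)
		return -1;

	ok = (fscanf(f, "%lx", mask) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

With the default mask, cf_dumps_file_pages(CF_DEFAULT_MASK) is false, so
only a process that explicitly sets bit 2 or 3 gets file-backed pages in
its core.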
--
Pedro