Re: [PATCH v4] coredump: Add /proc/<pid>/coredump_pre_exit for pre-exit before dumping
From: Christian Brauner
Date: Mon Jun 29 2026 - 03:16:31 EST

On 2026-06-25 13:43 +0200, David Hildenbrand (Arm) wrote:
> On 6/25/26 13:18, Pedro Falcato wrote:
> > On Thu, Jun 25, 2026 at 12:57:02PM +0200, David Hildenbrand (Arm) wrote:
> >>>
> >>> This makes no sense. I think you really need to sit down and think about
> >>> a design for this that doesn't introduce state machinery for boot, mm,
> >>> and the VFS in one shot to solve a fringe problem...
> >>
> >> Staring at exit_mmap_mapped_shared(), ... this looks rather hacky ("let's fake
> >> munmap and set some magical flags").
> >>
> >> We're essentially saying "we don't want (pretty much) anything that's MAP_SHARED
> >> in the coredump". And for some reason someone should configure that, that's a
> >> rather weird toggle tbh.
> >>
> >> And the granularity ("file-backed shared memory") is completely odd.
> >>
> >>
> >> Aren't there other ways we could optimize this internally?
> >>
> >> Like, if we know that a process is dead and cannot run anymore, downgrade writes
> >> to reads (and make sure we block GUP write attempts accordingly), or would that
> >> also not be sufficient?
> >>
> >>
> >> Another thought:
> >>
> >> fs/coredump.c calls get_dump_page().
> >>
> >> get_dump_page() will not fault in any memory. So if a page is not in the page
> >> tables at the time of the dump, it will not get included in the coredump. Which
> >> means, that whether most non-anonymous memory will be included in a coredump is
> >> already like playing the lottery.
> >>
> >> This is true for MAP_SHARED file mappings and MAP_PRIVATE file mappings without
> >> private modifications.
> >>
> >> Which makes me wonder: How much is tooling relying on file-backed pages to end
> >> up in a coredump?
> >
> > FWIW this mechanism already exists, see /proc/self/coredump_filter. The
> > default is bits 0, 1, 4 and 5 (see core(5)), which maps back to no file pages
> > being dumped to a core dump, apart from ELF headers (these help the debugger
> > trace back the mapped binary to the debug info using the buildid).
> >
> > So the answer to this question is "approximately none" :)
> >
>
> Ah, thanks! vma_dump_size() honors this, and I am sure through some magical
> routing the information stored in m->dump_size will end up not dumping these pages.
>
> Staring at elf_core_dump(), this "unmap some stuff" part is really, really
> nasty, as it effectively removes the VMAs->segments from the dump. (unless I am
> missing something important)

I wouldn't mind if you cleaned up the mm portion of the coredump code. I
think it's pretty ancient and, as you can see, rather questionable.
Fwiw, for the coredump socket I added a "userspace" mode a little while
ago where the kernel doesn't actually generate the coredump at all but
lets userspace do it. Android has the same model but uses signal
handler tricks (iirc) to do this.
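
For anyone following along, here is a minimal sketch (Python, purely
illustrative) of decoding the coredump_filter mask Pedro mentions above.
The bit descriptions are paraphrased from core(5); decode_filter() and
read_filter() are made-up helper names for this example, not an existing
API, and the fallback to 0x33 only reflects the documented default.

```python
from pathlib import Path

# Bit meanings paraphrased from core(5). The dict name is illustrative only.
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
    7: "private DAX pages",
    8: "shared DAX pages",
}


def decode_filter(mask):
    """Return descriptions of the mapping types selected by `mask`."""
    return [
        name
        for bit, name in sorted(COREDUMP_FILTER_BITS.items())
        if mask & (1 << bit)
    ]


def read_filter(pid="self"):
    """Read /proc/<pid>/coredump_filter; the file holds a hex value."""
    text = Path(f"/proc/{pid}/coredump_filter").read_text().strip()
    return int(text, 16)


if __name__ == "__main__":
    try:
        mask = read_filter()
    except OSError:
        # Not on Linux or procfs unavailable: use the documented default.
        mask = 0x33
    print(f"coredump_filter = {mask:#x}")
    for name in decode_filter(mask):
        print(f"  dumps {name}")
```

With the default mask neither of the file-backed bits (2 and 3) is set,
which is the point made in the quoted discussion.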