[PATCH 0/5] Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there

From: Jann Horn
Date: Mon Apr 27 2020 - 23:28:01 EST


At the moment, we have the rather ugly mmget_still_valid() helper to
work around <https://crbug.com/project-zero/1790>: ELF core dumping
doesn't take the mmap_sem while traversing the task's VMAs, and if
anything (like userfaultfd) then remotely messes with the VMA tree,
fireworks ensue. So for now, any writer that might be operating on a
remote mm's VMAs calls mmget_still_valid() and bails out if it fails.

With this series, I'm trying to get rid of the need for that as
cleanly as possible. In particular, I want to avoid holding the
mmap_sem across unbounded sleeps.


Patches 1, 2 and 3 are relatively unrelated cleanups in the core
dumping code.

Patches 4 and 5 implement the main change: Instead of repeatedly
accessing the VMA list with sleeps in between, we snapshot it at the
start with proper locking, and then later we just use our copy of
the VMA list. This ensures that the kernel won't crash, that VMA
metadata in the coredump is consistent even in the presence of
concurrent modifications, and that any virtual addresses that aren't
being concurrently modified have their contents show up in the core
dump properly.
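
To illustrate the idea only, here is a minimal userspace sketch of
"snapshot under the lock, then work from the copy". It is not the code
from patches 4 and 5: struct region, struct addr_space, struct
region_meta, snapshot_regions() and snapshot_total_size() are all
hypothetical names, and a pthread mutex stands in for the mmap_sem.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VMA: a node in a singly linked list. */
struct region {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
	struct region *next;
};

/* Hypothetical stand-in for an mm: a region list guarded by a lock. */
struct addr_space {
	pthread_mutex_t lock;
	struct region *regions;
	size_t count;
};

/* Per-region metadata copied out of the list; holds no list pointers. */
struct region_meta {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/*
 * Copy the metadata of every region while holding the lock.
 * Returns 0 on success, -1 if the allocation fails.
 * The caller owns *out and must free() it.
 */
int snapshot_regions(struct addr_space *as, struct region_meta **out,
		     size_t *out_count)
{
	struct region_meta *snap;
	struct region *r;
	size_t i = 0;

	assert(as && out && out_count);

	pthread_mutex_lock(&as->lock);
	snap = calloc(as->count ? as->count : 1, sizeof(*snap));
	if (!snap) {
		pthread_mutex_unlock(&as->lock);
		return -1;
	}
	for (r = as->regions; r && i < as->count; r = r->next, i++) {
		snap[i].start = r->start;
		snap[i].end = r->end;
		snap[i].flags = r->flags;
	}
	pthread_mutex_unlock(&as->lock);

	*out = snap;
	*out_count = i;
	return 0;
}

/*
 * Example consumer: runs without the lock and touches only the
 * snapshot, so later changes to the live list cannot affect it.
 */
unsigned long snapshot_total_size(const struct region_meta *snap,
				  size_t count)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < count; i++)
		total += snap[i].end - snap[i].start;
	return total;
}
```

The cost of this pattern is the one named below: the snapshot array
needs memory proportional to the number of regions for as long as the
dump runs.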

The disadvantage of this approach is that we need a bit more memory
during core dumping for storing metadata about all VMAs.

After this series has landed, we should be able to rip out
mmget_still_valid().


Testing done so far:

- Creating a simple core dump on X86-64 still works.
- The created coredump on X86-64 opens in GDB, and both the stack and the
executable look vaguely plausible.
- 32-bit ARM compiles with FDPIC support, both with MMU and !MMU config.

I'm CCing some folks from the architectures that use FDPIC in case
anyone wants to give this a spin.


This series is based on
<https://lore.kernel.org/linux-fsdevel/20200427200626.1622060-1-hch@xxxxxx/>
(Christoph Hellwig's "remove set_fs calls from the coredump code v4").

Jann Horn (5):
  binfmt_elf_fdpic: Stop using dump_emit() on user pointers on !MMU
  coredump: Fix handling of partial writes in dump_emit()
  coredump: Refactor page range dumping into common helper
  binfmt_elf, binfmt_elf_fdpic: Use a VMA list snapshot
  mm/gup: Take mmap_sem in get_dump_page()

 fs/binfmt_elf.c          | 170 ++++++++++++---------------------------
 fs/binfmt_elf_fdpic.c    | 106 +++++++++---------------
 fs/coredump.c            | 102 +++++++++++++++++++++++
 include/linux/coredump.h |  12 +++
 mm/gup.c                 |  69 +++++++++-------
 5 files changed, 243 insertions(+), 216 deletions(-)


base-commit: 6a8b55ed4056ea5559ebe4f6a4b247f627870d4c
prerequisite-patch-id: c0a20b414eebc48fe0a8ca570b05de34c7980396
prerequisite-patch-id: 51973b8db0fa4b114e0c3fd8936b634d9d5061c5
prerequisite-patch-id: 0e1e8de282ca6d458dc6cbdc6b6ec5879edd8a05
prerequisite-patch-id: d5ee749c4d3a22ec80bd0dd88aadf89aeb569db8
prerequisite-patch-id: 46ce14e59e98e212a1eca0aef69c6dcdb62b8242
--
2.26.2.303.gf8c07b1a785-goog