Re: [RFC][PATCHES] converting FDPIC coredumps to regsets

From: Eric W. Biederman
Date: Tue Jul 14 2020 - 13:17:19 EST


Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:

> Conversion of ELF coredumps to regsets has not touched
> ELF_FDPIC. Right now all architectures that support FDPIC have
> regsets sufficient for switching it to regset-based coredumps. A bit
> of backstory: original ELF (and ELF_FDPIC) coredumps reused the old
> helpers used by a.out coredumps. These days a.out coredumps are gone;
> we could remove the dead code, if not for several obstacles. And one
> of those obstacles is ELF_FDPIC.
>
> This series more or less reproduces the conversion done
> by Roland for ELF coredumps. The branch is in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.fdpic
> and it's based on top of #regset.base there (just the introduction of
> regset_get() wrapper for ->get(); nothing else from the regset series
> is needed). Killing the old aout helpers is _not_ in this branch;
> followup cleanups live separately.
>
> First we need to sort out the mess with struct elf_prstatus,
> though. It's used both for ELF and ELF_FDPIC coredumps, and it
> contains a couple of fields under ifdef on CONFIG_BINFMT_ELF_FDPIC.
> ELF is MMU-dependent and most, but not all configs that allow ELF_FDPIC
> are non-MMU. ARM is an exception - there ELF_FDPIC is allowed both for
> MMU and non-MMU configs. That's a problem - struct elf_prstatus is a
> part of coredump layout, so ELF coredumps produced by arm kernels that
> have ELF_FDPIC enabled are incompatible with those that have it disabled.
>
> The obvious solution is to introduce struct elf_prstatus_fdpic
> and use that in binfmt_elf_fdpic.c, taking these fields out of the
> normal struct elf_prstatus. Unfortunately, the damn thing is defined in
> include/uapi/linux/elfcore.h, so nominally it's a part of userland ABI.
> However, not a single userland program actually includes linux/elfcore.h.
> The reason is that the definition in there uses elf_gregset_t as a member,
> and _that_ is not defined anywhere in the exported headers. It is defined
> in (libc) sys/procfs.h, but the same file defines struct elf_prstatus
> as well. So if you try to include linux/elfcore.h without having already
> pulled sys/procfs.h, it'll break on incomplete type of a member. And if
> you have pulled sys/procfs.h, it'll break on redefining a structure.
> IOW, it's not usable and it never has been; as a matter of fact,
> that's the reason sys/procfs.h was introduced back in 1996.
>
> 1/7) unexport linux/elfcore.h
> Takes it out of include/uapi/linux and moves the stuff that used
> to live there into include/linux/elfcore.h
>
> 2/7) take fdpic-related parts of elf_prstatus out
> Now we can take that ifdef out of the definition of elf_prstatus
> (as well as compat_elf_prstatus) and put the variant with those extra
> fields into binfmt_elf_fdpic.c, calling it elf_prstatus_fdpic there.
>
> 3/7) kill elf_fpxregs_t
> All code dealing with it (both in elf_fdpic and non-regset side
> of elf) is conditional upon ELF_CORE_COPY_XFPREGS. And no architectures
> define that anymore. Take the dead code out.
>
> 4/7) [elf-fdpic] coredump: don't bother with cyclic list for per-thread
> objects
> 5/7) [elf-fdpic] move allocation of elf_thread_status into
> elf_dump_thread_status()
> 6/7) [elf-fdpic] use elf_dump_thread_status() for the dumper thread as well
> Massaging the fdpic coredump logic towards the regset side of
> elf coredump.
>
> 7/7) [elf-fdpic] switch coredump to regsets
> ... and now we can switch from elf_core_copy_task_{,fp}regs()
> to regset_get().

I just did a quick read through.

The KABI bits look sane, or rather pulling definitions out of the KABI
headers because they are not usable seems like a reasonable response to
a messed up situation. In the long run it would be good if we could get
some proper KABI headers for the format of coredumps.
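
For anyone following along, the layout problem described in the quoted
text can be seen with a small sketch. Everything here is a simplified
stand-in, not the kernel's definition: the demo_* names, the field names
and the 18-entry register array are assumptions made for illustration
only. The point is just that fields placed under a config ifdef shift
everything after them, so the two builds write different note layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for elf_gregset_t; the real type is per-arch. */
typedef unsigned long demo_gregset_t[18];

/* Layout seen by a build where the FDPIC-only fields are compiled out. */
struct demo_prstatus {
	int pid;
	demo_gregset_t reg;
	int fpvalid;
};

/* Layout seen by a build where the two extra fields are compiled in. */
struct demo_prstatus_fdpic {
	int pid;
	demo_gregset_t reg;
	unsigned long exec_loadmap;	/* hypothetical name */
	unsigned long interp_loadmap;	/* hypothetical name */
	int fpvalid;
};

/* How far the field after the registers moves between the two layouts. */
size_t demo_fpvalid_shift(void)
{
	return offsetof(struct demo_prstatus_fdpic, fpvalid) -
	       offsetof(struct demo_prstatus, fpvalid);
}

/* The two layouts disagree on both the field offset and the total size. */
void demo_check_layouts(void)
{
	assert(demo_fpvalid_shift() == 2 * sizeof(unsigned long));
	assert(sizeof(struct demo_prstatus_fdpic) >
	       sizeof(struct demo_prstatus));
}
```

A reader that assumes one layout while parsing a dump written with the
other would pick up fpvalid from the wrong offset, which is why giving
the FDPIC variant its own struct keeps the plain ELF layout stable.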

I am a bit confused about what is happening in the cleanups, and frankly
the fault really lies with binfmt_elf.c, as binfmt_elf.c in Linus's
tree still has both a regset and a non-regset version of core dumping.

What I see happening is that you are taking what started off as a copy
of the non-regset version of elf coredumping and transforming it into
something close to the regset version of coredumping. Which is
sensible. The fact that the elf_fdpic code continues to use the
non-regset names for the functions it calls, and does not synchronize
its structure with the ordinary elf core dumping code, may be sensible,
but it is extremely confusing to follow.

As a follow-up it would probably be good to synchronize the elf and
elf_fdpic coredumping code as much as possible, just to simplify
future maintenance.

So, for as much as I could understand and verify:

Acked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

Eric