Re: [RFC PATCH] fs/coredump: Enable dynamic configuration of max file note size

From: Luis Chamberlain
Date: Mon Apr 29 2024 - 14:38:54 EST


On Mon, Apr 29, 2024 at 05:21:28PM +0000, Allen Pais wrote:
> Introduce the capability to dynamically configure the maximum file
> note size for ELF core dumps via sysctl. This enhancement removes
> the previous static limit of 4MB, allowing system administrators to
> adjust the size based on system-specific requirements or constraints.
>
> - Remove hardcoded `MAX_FILE_NOTE_SIZE` from `fs/binfmt_elf.c`.
> - Define `max_file_note_size` in `fs/coredump.c` with an initial value set to 4MB.
> - Declare `max_file_note_size` as an external variable in `include/linux/coredump.h`.
> - Add a new sysctl entry in `kernel/sysctl.c` to manage this setting at runtime.
>
> $ sysctl -a | grep max_file_note_size
> kernel.max_file_note_size = 4194304
>
> $ sysctl -n kernel.max_file_note_size
> 4194304
>
> $ echo 519304 > /proc/sys/kernel/max_file_note_size
>
> $ sysctl -n kernel.max_file_note_size
> 519304

This doesn't say anything about *why*. Presumably you hit a use case in
practice where the ELF notes are huge; can you give an example of that?
The commit log should also state that this limit is only used in the
coredump path for ELF binaries, via elf_core_dump().

More below.

> Signed-off-by: Vijay Nag <nagvijay@xxxxxxxxxxxxx>
> Signed-off-by: Allen Pais <apais@xxxxxxxxxxxxxxxxxxx>
> ---
> fs/binfmt_elf.c | 3 +--
> fs/coredump.c | 3 +++
> include/linux/coredump.h | 1 +
> kernel/sysctl.c | 8 ++++++++
> 4 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 5397b552fbeb..5fc7baa9ebf2 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -1564,7 +1564,6 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
> fill_note(note, "CORE", NT_SIGINFO, sizeof(*csigdata), csigdata);
> }
>
> -#define MAX_FILE_NOTE_SIZE (4*1024*1024)
> /*
> * Format of NT_FILE note:
> *
> @@ -1592,7 +1591,7 @@ static int fill_files_note(struct memelfnote *note, struct coredump_params *cprm
>
> names_ofs = (2 + 3 * count) * sizeof(data[0]);
> alloc:
> - if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
> + if (size >= max_file_note_size) /* paranoia check */
> return -EINVAL;
> size = round_up(size, PAGE_SIZE);
> /*
> diff --git a/fs/coredump.c b/fs/coredump.c
> index be6403b4b14b..a83c6cc893fc 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -56,10 +56,13 @@
> static bool dump_vma_snapshot(struct coredump_params *cprm);
> static void free_vma_snapshot(struct coredump_params *cprm);
>
> +#define MAX_FILE_NOTE_SIZE (4*1024*1024)
> +
> static int core_uses_pid;
> static unsigned int core_pipe_limit;
> static char core_pattern[CORENAME_MAX_SIZE] = "core";
> static int core_name_size = CORENAME_MAX_SIZE;
> +unsigned int max_file_note_size = MAX_FILE_NOTE_SIZE;
>
> struct core_name {
> char *corename;
> diff --git a/include/linux/coredump.h b/include/linux/coredump.h
> index d3eba4360150..e1ae7ab33d76 100644
> --- a/include/linux/coredump.h
> +++ b/include/linux/coredump.h
> @@ -46,6 +46,7 @@ static inline void do_coredump(const kernel_siginfo_t *siginfo) {}
> #endif
>
> #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL)
> +extern unsigned int max_file_note_size;
> extern void validate_coredump_safety(void);
> #else
> static inline void validate_coredump_safety(void) {}
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 81cc974913bb..80cdc37f2fa2 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -63,6 +63,7 @@
> #include <linux/mount.h>
> #include <linux/userfaultfd_k.h>
> #include <linux/pid.h>
> +#include <linux/coredump.h>
>
> #include "../lib/kstrtox.h"
>
> @@ -1623,6 +1624,13 @@ static struct ctl_table kern_table[] = {
> .mode = 0644,
> .proc_handler = proc_dointvec,
> },
> + {
> + .procname = "max_file_note_size",
> + .data = &max_file_note_size,
> + .maxlen = sizeof(unsigned int),
> + .mode = 0644,
> + .proc_handler = proc_dointvec,
> + },
> #ifdef CONFIG_PROC_SYSCTL

No, please move this to coredump_sysctls in fs/coredump.c. And there is
no point in supporting int; this is an unsigned int, right? So use the
right proc handler for it.
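
To illustrate why the handler should match the variable's type, here is
a minimal userspace sketch. It is not kernel code: store_signed() and
store_unsigned() are hypothetical stand-ins that only mimic a signed and
an unsigned range check, on the assumption that a signed handler accepts
negative input while an unsigned one rejects it.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-ins, NOT kernel code. They mimic only the
 * signed vs. unsigned range checks, to show what happens when a signed
 * parser writes into an unsigned variable.
 */

/* Signed-style store: accepts INT_MIN..INT_MAX, including negatives. */
static int store_signed(const char *s, unsigned int *var)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || end == s || *end != '\0' || v < INT_MIN || v > INT_MAX)
		return -EINVAL;
	/* A negative value wraps around to a huge unsigned limit. */
	*var = (unsigned int)(int)v;
	return 0;
}

/* Unsigned-style store: accepts 0..UINT_MAX and rejects any sign. */
static int store_unsigned(const char *s, unsigned int *var)
{
	char *end;
	unsigned long long v;

	/* Require a leading digit so "-1" or "+1" is refused outright. */
	if (*s < '0' || *s > '9')
		return -EINVAL;
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno || end == s || *end != '\0' || v > UINT_MAX)
		return -EINVAL;
	*var = (unsigned int)v;
	return 0;
}
```

With the signed variant, writing "-1" succeeds and leaves the limit at
UINT_MAX, which effectively disables the check in fill_files_note(). The
unsigned variant refuses it and leaves the old value in place.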

If we're gonna do this, it makes sense to document the ELF note binary
limitations. Then consider a defense too: what if a specially crafted
binary with a huge ELF note is core dumped many times, what then?
Lifting the limit past 4 MiB puts us in a situation where abuse can lead
to many silly, insane kvmalloc()s. Is that what we want? Why?
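
One possible shape for such a defense, as a sketch only, is to accept
new values only inside a fixed range; in the kernel that would probably
mean a min/max-style handler such as proc_douintvec_minmax. The
userspace code below just shows the check itself. The bounds and the
names NOTE_SIZE_MIN, NOTE_SIZE_MAX and set_note_size_limit() are
hypothetical, picked for illustration.

```c
#include <errno.h>

/* Hypothetical bounds, chosen only for illustration. */
#define NOTE_SIZE_MIN (4U * 1024 * 1024)
#define NOTE_SIZE_MAX (16U * 1024 * 1024)

/* Stand-in for the tunable; starts at the current 4 MiB default. */
static unsigned int note_size_limit = NOTE_SIZE_MIN;

/* Accept a new limit only if it lies in [NOTE_SIZE_MIN, NOTE_SIZE_MAX]. */
static int set_note_size_limit(unsigned int val)
{
	if (val < NOTE_SIZE_MIN || val > NOTE_SIZE_MAX)
		return -EINVAL;	/* the current limit is left untouched */
	note_size_limit = val;
	return 0;
}
```

A hard ceiling like this caps the worst-case allocation per dump, though
it does nothing about the same binary being dumped over and over.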

Luis