Re: [RFC][PATCH] exec: Use init rlimits for setuid exec

From: Eric W. Biederman
Date: Thu Jul 06 2017 - 08:46:28 EST


Kees Cook <keescook@xxxxxxxxxxxx> writes:

> In an attempt to provide sensible rlimit defaults for setuid execs, this
> inherits the namespace's init rlimits:
>
> $ ulimit -s
> 8192
> $ ulimit -s unlimited
> $ /bin/sh -c 'ulimit -s'
> unlimited
> $ sudo /bin/sh -c 'ulimit -s'
> 8192
>
> This is modified from Brad Spengler/PaX Team's hard-coded setuid exec
> stack rlimit (8MB) in the last public patch of grsecurity/PaX based on
> my understanding of the code. Changes or omissions from the original
> code are mine and don't reflect the original grsecurity/PaX code.
>
> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> ---
> Instead of copying all rlimits, we could also pick specific ones to copy
> (e.g. RLIMIT_STACK, or ones from Andy's list) or exclude from copying
> (probably better to blacklist than whitelist).
>
> I think this is the right way to find the ns init task, but maybe it
> needs locking?
> ---
> fs/exec.c | 34 ++++++++++++++++++++++++++++++----
> 1 file changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 904199086490..80e8b2bd4284 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1675,6 +1675,12 @@ static int exec_binprm(struct linux_binprm *bprm)
> return ret;
> }
>
> +static inline bool is_setuid_exec(struct linux_binprm *bprm)
> +{
> + return (!uid_eq(bprm->cred->euid, current_euid()) ||
> + !gid_eq(bprm->cred->egid, current_egid()));
> +}

Awesome, I can make an executable setuid to myself and get all of root's
rlimits!

Scratch inheritable rlimits as useful for any kind of policy decision.
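
To illustrate the concern, here is a minimal user-space model of the
predicate quoted above. It is only a sketch: struct ids and
model_is_setuid_exec() are hypothetical names, not kernel code, and the
plain integer compare stands in for uid_eq()/gid_eq(). The point it shows
is that the check fires on any change of effective UID or GID, whether or
not the exec actually gains privilege (for instance, a binary made setgid
to a group the caller already belongs to, which is my own example).

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical stand-in for the effective IDs carried by a credential. */
struct ids {
    uid_t euid;
    gid_t egid;
};

/*
 * User-space model of the patch's is_setuid_exec(): true when the exec
 * would change either the effective UID or the effective GID.  Nothing
 * here asks whether the new IDs are more privileged than the old ones.
 */
static bool model_is_setuid_exec(struct ids cur, struct ids new_cred)
{
    return new_cred.euid != cur.euid || new_cred.egid != cur.egid;
}
```

Under this model any exec that changes an effective ID would pick up the
init task's limits, so the trigger is in the hands of whoever can create
such a binary.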

> /*
> * sys_execve() executes a new program.
> */
> @@ -1687,6 +1693,7 @@ static int do_execveat_common(int fd, struct filename *filename,
> struct linux_binprm *bprm;
> struct file *file;
> struct files_struct *displaced;
> + struct rlimit saved_rlim[RLIM_NLIMITS];
> int retval;
>
> if (IS_ERR(filename))
> @@ -1771,24 +1778,38 @@ static int do_execveat_common(int fd, struct filename *filename,
> if (retval < 0)
> goto out;
>
> + /*
> + * From here forward, we've got credentials set up and we're
> + * using resources, so do rlimit replacement before we start
> + * copying strings. (Note that the RLIMIT_NPROC check has
> + * already happened.)
> + */
> + BUILD_BUG_ON(sizeof(saved_rlim) != sizeof(current->signal->rlim));
> + if (is_setuid_exec(bprm)) {
> + memcpy(saved_rlim, current->signal->rlim, sizeof(saved_rlim));
> + memcpy(current->signal->rlim,
> + task_active_pid_ns(current)->child_reaper->signal->rlim,
> + sizeof(current->signal->rlim));
> + }
> +

Careful. child_reaper can change if you are not holding the tasklist
lock.

It would be better if we could move any rlimit changes after de_thread.
Otherwise there are some really fun races you can play with.

After de_thread we are past the point of no return, so you would not
need to worry about restoring the rlimits either.
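
The locking point can be sketched in user space as follows. This is an
analogy under stated assumptions, not a proposed kernel change:
model_task, model_child_reaper, model_tasklist_lock and MODEL_NLIMITS are
all hypothetical names, and a plain pthread mutex stands in for the
kernel's reader/writer tasklist_lock. The idea is simply to copy the
reaper's limits into a local buffer while the lock is held, so the
pointer cannot be replaced in the middle of the copy.

```c
#include <pthread.h>
#include <string.h>
#include <sys/resource.h>

#define MODEL_NLIMITS 4 /* hypothetical; the kernel uses RLIM_NLIMITS */

/* Hypothetical stand-in for the part of a task that holds its rlimits. */
struct model_task {
    struct rlimit rlim[MODEL_NLIMITS];
};

/* Stand-ins for tasklist_lock and the pid namespace's child_reaper. */
static pthread_mutex_t model_tasklist_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_task *model_child_reaper;

/* Writer side: replacing the reaper happens only with the lock held. */
static void model_set_child_reaper(struct model_task *t)
{
    pthread_mutex_lock(&model_tasklist_lock);
    model_child_reaper = t;
    pthread_mutex_unlock(&model_tasklist_lock);
}

/*
 * Reader side: dereference the reaper pointer and copy its limits in
 * one critical section, then work only on the private copy afterwards.
 */
static void model_snapshot_reaper_rlimits(struct rlimit out[MODEL_NLIMITS])
{
    pthread_mutex_lock(&model_tasklist_lock);
    memcpy(out, model_child_reaper->rlim, sizeof(model_child_reaper->rlim));
    pthread_mutex_unlock(&model_tasklist_lock);
}
```

With a private snapshot in hand, the copy into the current task's limits
could then be deferred until after the point of no return, which is what
removes the need for a restore path.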

> retval = copy_strings_kernel(1, &bprm->filename, bprm);
> if (retval < 0)
> - goto out;
> + goto out_restore;
>
> bprm->exec = bprm->p;
> retval = copy_strings(bprm->envc, envp, bprm);
> if (retval < 0)
> - goto out;
> + goto out_restore;
>
> retval = copy_strings(bprm->argc, argv, bprm);
> if (retval < 0)
> - goto out;
> + goto out_restore;
>
> would_dump(bprm, bprm->file);
>
> retval = exec_binprm(bprm);
> if (retval < 0)
> - goto out;
> + goto out_restore;
>
> /* execve succeeded */
> current->fs->in_exec = 0;
> @@ -1802,6 +1823,11 @@ static int do_execveat_common(int fd, struct filename *filename,
> put_files_struct(displaced);
> return retval;
>
> +out_restore:
> + if (is_setuid_exec(bprm)) {
> + memcpy(current->signal->rlim, saved_rlim, sizeof(saved_rlim));
> + }
> +
> out:
> if (bprm->mm) {
> acct_arg_size(bprm, 0);
> --
> 2.7.4

Eric