Re: [RFC] exec: add a flag for "reasonable" execveat() comm
From: Kees Cook
Date: Mon Oct 14 2024 - 17:13:42 EST
On Wed, Oct 09, 2024 at 08:41:31AM -0600, Tycho Andersen wrote:
> On Wed, Oct 02, 2024 at 02:34:43PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > On Tue, Sep 24, 2024 at 12:39:35PM -0500, Eric W. Biederman wrote:
> > > Tycho Andersen <tycho@tycho.pizza> writes:
> > >
> > > > From: Tycho Andersen <tandersen@xxxxxxxxxxx>
> > > >
> > > > Zbigniew mentioned at Linux Plumber's that systemd is interested in
> > > > switching to execveat() for service execution, but can't, because the
> > > > contents of /proc/pid/comm are the file descriptor which was used,
> > > > instead of the path to the binary. This makes the output of tools like
> > > > top and ps useless, especially in a world where most fds are opened
> > > > CLOEXEC so the number is truly meaningless.
> > > >
> > > > This patch adds an AT_ flag to fix up /proc/pid/comm to instead be the
> > > > contents of argv[0], instead of the fdno.
> >
> > I tried this version (with a local modification to drop the flag and
> > enable the new codepath if get_user_arg_ptr(argv, 0) returns nonnull
> > as suggested later in the thread), and it seems to work as expected.
> > In particular, 'pgrep' finds for the original name in case of
> > symlinks.
>
> Here is a version that only affects /proc/pid/comm, without a flag. We
> still have to do the dance of keeping the user argv0 before actually
> doing __set_task_comm(), since we want to surface the resulting fault
> if people pass a bad argv0. Thoughts?
>
> Tycho
>
>
>
> diff --git a/fs/exec.c b/fs/exec.c
> index dad402d55681..61de8a71f316 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1416,7 +1416,16 @@ int begin_new_exec(struct linux_binprm * bprm)
> set_dumpable(current->mm, SUID_DUMP_USER);
>
> perf_event_exec();
> - __set_task_comm(me, kbasename(bprm->filename), true);
> +
> + /*
> + * If fdpath was set, execveat() made up a path that will
> + * probably not be useful to admins running ps or similar.
> + * Let's fix it up to be something reasonable.
> + */
> + if (bprm->argv0)
> + __set_task_comm(me, kbasename(bprm->argv0), true);
> + else
> + __set_task_comm(me, kbasename(bprm->filename), true);
This isn't checking fdpath?
>
> /* An exec changes our domain. We are no longer part of the thread
> group */
> @@ -1566,9 +1575,30 @@ static void free_bprm(struct linux_binprm *bprm)
> if (bprm->interp != bprm->filename)
> kfree(bprm->interp);
> kfree(bprm->fdpath);
> + kfree(bprm->argv0);
> kfree(bprm);
> }
>
> +static int bprm_add_fixup_comm(struct linux_binprm *bprm, struct user_arg_ptr argv)
> +{
> + const char __user *p = get_user_arg_ptr(argv, 0);
> +
> + /*
> + * In keeping with the logic in do_execveat_common(), we say p == NULL
> + * => "" for comm.
> + */
> + if (!p) {
> + bprm->argv0 = kstrdup("", GFP_KERNEL);
> + return 0;
> + }
> +
> + bprm->argv0 = strndup_user(p, MAX_ARG_STRLEN);
> + if (bprm->argv0)
> + return 0;
> +
> + return -EFAULT;
> +}
I'd rather this logic got done in copy_strings() and to avoid duplicating
a copy for all exec users. I think it should be possible to just do
this, to find the __user char *:
diff --git a/fs/exec.c b/fs/exec.c
index 77364806b48d..e12fd706f577 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -642,6 +642,8 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
goto out;
}
}
+ if (argc == 0)
+ bprm->argv0 = str;
}
ret = 0;
out:
Once we get to begin_new_exec(), only if we need to do the work (fdpath
set), then we can do the strndup_user() instead of making every exec
hold a copy regardless of whether it will be needed.
-Kees
> +
> static struct linux_binprm *alloc_bprm(int fd, struct filename *filename, int flags)
> {
> struct linux_binprm *bprm;
> @@ -1975,6 +2005,10 @@ static int do_execveat_common(int fd, struct filename *filename,
> goto out_ret;
> }
>
> + retval = bprm_add_fixup_comm(bprm, argv);
> + if (retval != 0)
> + goto out_free;
> +
> retval = count(argv, MAX_ARG_STRINGS);
> if (retval == 0)
> pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n",
> diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
> index e6c00e860951..0cd1f2d0e8c6 100644
> --- a/include/linux/binfmts.h
> +++ b/include/linux/binfmts.h
> @@ -55,6 +55,7 @@ struct linux_binprm {
> of the time same as filename, but could be
> different for binfmt_{misc,script} */
> const char *fdpath; /* generated filename for execveat */
> + const char *argv0; /* argv0 from execveat */
> unsigned interp_flags;
> int execfd; /* File descriptor of the executable */
> unsigned long loader, exec;
--
Kees Cook