Re: Regression: commit da029c11e6b1 broke toybox xargs.
From: Kees Cook
Date: Wed Nov 01 2017 - 23:30:37 EST
On Wed, Nov 1, 2017 at 4:34 PM, Rob Landley <rob@xxxxxxxxxxx> wrote:
> Toybox has been trying to figure out how big an xargs is allowed to be
> for a while:
>
> http://lists.landley.net/pipermail/toybox-landley.net/2017-October/009186.html
>
> We're trying to avoid the case where you can run something from the
> command line, but not through xargs. In theory this limit is
> sysconf(_SC_ARG_MAX) which on bionic and glibc returns 1/4 RLIMIT_STACK
> (in accordance with <strike>the prophecy</strike> fs/exec.c function
> get_arg_page()), but that turns out to be too simple. There's also a
> 131071 byte limit on each _individual_ argument, which I think I've
> tracked down to fs/exec.c function setup_arg_pages() doing:
>
> stack_expand = 131072UL; /* randomly 32*4k (or 2*64k) pages */
>
> And then it worked under ubuntu 14.04 but not current kernels. Why?
> Because the above commit from Kees Cook broke it, by taking this:
>
> include/uapi/linux/resource.h:
> /*
> * Limit the stack by to some sane default: root can always
> * increase this limit if needed.. 8MB seems reasonable.
> */
> #define _STK_LIM (8*1024*1024)
>
> And hardwiring in a random adjustment as a "640k ought to be enough for
> anybody" constant on TOP of the existing RLIMIT_STACK/4 check. Without
> even adjusting the "oh of course root can make this bigger, this is just
> a default value" comment where it's #defined.
>
> Look, if you want to cap RLIMIT_STACK for suid binaries, go for it. The
> existing code will notice and adapt. But this new commit is crazy and
> arbitrary and introduces more random version dependencies (how is
> sysconf() supposed to know the value, an #if/else staircase based on
> kernel version in every libc)?
>
> Please revert it,
Hi Linus,
This is a report of userspace breakage due to:
commit da029c11e6b1 ("exec: Limit arg stack to at most 75% of _STK_LIM")
As a reminder of the earlier discussions[1], it had been suggested that
this be setuid only, but you had asked that this be globally applied:
On Mon, Jul 10, 2017 at 11:24 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> But honestly, a security limit that isn't tested in normal working is
> not a security limit at all, it's just theory and likely bullshit. So
> I'd much rather *not* make it suid-specific if at all possible. That
> way it has some chance in hell of actually getting tested.
We're going to need to revisit this. One alternative that was
suggested by Andy was to do a late "how much stack space was used?"
check after arg processing was finished. This could be attached to a
secureexec test to limit the checks for pathological conditions only
to setuid processes.
Rob, thanks for the report! Can you confirm that reverting the above
commit fixes the problem? There is also
commit 98da7d08850f ("fs/exec.c: account for argv/envp pointers")
which changes the calculation slightly too. If _SC_ARG_MAX is
hardcoded in bionic and glibc as 1/4 RLIMIT_STACK, we may need to
adjust this commit as well, since it will be a problem for giant
argument lists of very short strings.
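To make the arithmetic concrete, here is a rough sketch of the two
calculations being discussed. It assumes the effective limit is the
smaller of RLIMIT_STACK/4 and 75% of _STK_LIM, and that each string is
charged its NUL terminator plus one pointer slot. arg_limit() and
arg_cost() are hypothetical helper names for illustration, not kernel
or libc functions:

```c
#include <stdint.h>

/* Default stack limit, as quoted from include/uapi/linux/resource.h. */
#define STK_LIM (8ULL * 1024 * 1024)

/*
 * Hypothetical helper: upper bound on argv+envp bytes for a given
 * RLIMIT_STACK, assuming the limit is the smaller of RLIMIT_STACK/4
 * and 75% of _STK_LIM.
 */
static uint64_t arg_limit(uint64_t rlimit_stack)
{
	uint64_t quarter = rlimit_stack / 4;
	uint64_t cap = STK_LIM / 4 * 3;

	return quarter < cap ? quarter : cap;
}

/*
 * Hypothetical helper: bytes charged for `count` strings of `len`
 * characters each, assuming every string costs its NUL terminator
 * plus one pointer slot of `ptr_size` bytes.
 */
static uint64_t arg_cost(uint64_t count, uint64_t len, uint64_t ptr_size)
{
	return count * (len + 1 + ptr_size);
}
```

Under these assumptions an 8 MiB stack limit still yields 2 MiB, and the
6 MiB cap only takes effect once RLIMIT_STACK is raised above 24 MiB.
For one-character arguments with 8-byte pointers, arg_cost() charges 10
bytes per argument, of which only 2 are visible to a libc that counts
string bytes alone.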
Thanks,
-Kees
[1] https://lkml.org/lkml/2017/7/10/633
--
Kees Cook
Pixel Security