Re: [RFC][PATCH] exec: Use init rlimits for setuid exec

From: Kees Cook
Date: Thu Jul 06 2017 - 15:13:03 EST

On Thu, Jul 6, 2017 at 10:52 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Jul 6, 2017 at 10:29 AM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>> (a) minimal: just use our existing default stack (and stack _only_)
>>> limit value for suid binaries that actually get extra permissions: {
>> This would look a lot like the existing patch; it'd just not copy the
>> init process rlimits.
> Can't we just do the final rlimit setting so late in execve that we
> don't need that whole "saved_rlimit" thing?

The stack rlimit defines the mmap layout too:

do_execveat_common() ->
  exec_binprm() ->
    search_binary_handler() ->
      fmt->load_binary (load_elf_binary()) ->
        setup_new_exec() ->
          arch_pick_mmap_layout() ->
            mmap_is_legacy() ->

exec_binprm() happens after the other stack setup (copy_strings()), so
if we wanted to avoid saved_rlimit, we'd have to replumb how
arch_pick_mmap_layout() works and how copy_strings() performs its
calculations (neither looks too terrible).
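
To illustrate the dependency at the end of that chain, here is a rough
userspace sketch (not the kernel source) of the kind of check the x86
legacy-layout test makes, as I read it. layout_is_legacy(), its
parameters, and SKETCH_COMPAT_LAYOUT are hypothetical stand-ins for
current->personality, rlimit(RLIMIT_STACK), and the legacy VA layout
sysctl:

```c
#include <sys/resource.h>   /* rlim_t, RLIM_INFINITY */

/* Illustrative flag bit standing in for the "compat layout" personality
 * flag; the name is hypothetical and local to this sketch. */
#define SKETCH_COMPAT_LAYOUT 0x0200000UL

/*
 * Returns 1 if the legacy (bottom-up) mmap layout would be chosen.
 * All inputs are passed in explicitly instead of being read from the
 * current task, which is the "replumbing" idea in miniature.
 */
int layout_is_legacy(unsigned long personality, rlim_t stack_limit,
                     int sysctl_legacy)
{
    if (personality & SKETCH_COMPAT_LAYOUT)
        return 1;

    /* An unlimited stack rlimit forces the legacy layout. */
    if (stack_limit == RLIM_INFINITY)
        return 1;

    return sysctl_legacy;
}
```

The point is only that the stack limit is an input to the layout
decision, so whichever limit is in effect when this runs is the one
that matters.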

> If the issue is the "people can use argv/envp to already fill the
> stack", then I'd actually be happier with just limiting that.
> We already claim that our ARG_MAX is just 128kB (old legacy). And I
> was really happy when we changed our execve() to not have that nasty
> array of pages, and we could expand on the array sizes. But we could
> *easily* just say "limit execve arrays to 8MB", because while our code
> can handle more, you do have latency issues and just memory use issues
> too.

That would address the argv/envp calculation but not the layout control.

> So right now we already limit the stack size artificially to 1/4 the
> stack rlimit (see get_arg_page()), and we could easily just further
> cap it at 8M total - right now people obviously actually run in
> practice with much less (ie for me that argument size is capped at a
> quarter of that 8MB default rlimit).
> I have heard of people who want a big stack due to crazy recursion or
> due to just doing otherwise insane things. But needing more than 8MB
> of arg/envp? Not happening.
> So I think we could easily do that stack rlimit thing at the very last
> minute, and not have to worry about restoring anything.

We should double-check that nothing besides the argv sizing and the
layout choice depends on the stack rlimit, but I think both could be
handed a value passed down from above instead of examining current's
rlimits.
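
A minimal sketch of what taking the limit as a parameter could look
like for the argument-size check, assuming the quarter-of-rlimit rule
and the 8MB cap discussed above. arg_size_limit() and the SKETCH_*
constants are hypothetical names, not existing kernel symbols, and the
128kB floor is an assumption modelled on the legacy ARG_MAX:

```c
#include <sys/resource.h>   /* rlim_t, RLIM_INFINITY */

#define SKETCH_ARG_FLOOR (128UL * 1024)        /* legacy 128kB ARG_MAX (assumed floor) */
#define SKETCH_ARG_CAP   (8UL * 1024 * 1024)   /* proposed 8MB cap on argv/envp */

/*
 * Largest combined argv/envp size allowed for a given stack limit.
 * The limit is a parameter, so the caller decides which rlimit applies
 * rather than this code reading it from the current task.
 */
rlim_t arg_size_limit(rlim_t stack_limit)
{
    rlim_t limit = stack_limit / 4;   /* at most a quarter of the stack */

    if (limit > SKETCH_ARG_CAP)
        limit = SKETCH_ARG_CAP;
    if (limit < SKETCH_ARG_FLOOR)
        limit = SKETCH_ARG_FLOOR;
    return limit;
}
```

With the default 8MB stack this gives 2MB, and an unlimited stack
still tops out at 8MB.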


Kees Cook
Pixel Security