Re: [RFC][PATCH] exec: Use init rlimits for setuid exec

From: Linus Torvalds
Date: Thu Jul 06 2017 - 12:34:21 EST

On Wed, Jul 5, 2017 at 9:32 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> In an attempt to provide sensible rlimit defaults for setuid execs, this
> inherits the namespace's init rlimits:

Yeah, so I have to admit to hating this patch.

As already mentioned by others, it's not only not clear that we want
to do this on every setuid exec, it's also not clear that init is the
right source of limits, or even which limits we'd want to copy.

I can easily see init setting an rlimit for its own use, and then, when
it goes through the fork/exec process, setting up some other rlimit
for what it is going to run. You'd presumably want that for any
non-system thing, so it's actually fairly natural to do it for system
things too, so it's not at all obvious that "init" itself would run
with some generic "system limits".

So to me this feels like a bad hack that was brought on by this
particular attack.

I'd much rather see something like

(a) minimal: just use our existing default stack (and stack _only_)
limit value for suid binaries that actually get extra permissions.


(b) fancier: per-namespace defaults that can be explicitly set by
something, and enabled individually.


(c) perhaps encourage people to annotate their suid binaries with
initial resource requirements (and for stack, I mean the existing
GNU_STACK ELF annotation in particular).

For an example of (a), that existing _STK_LIM define is what the
kernel defaults to, and it's an 8MB stack. And looking at my Fedora
install, I see that the default user rlimit is 8MB for the stack.

Is that just coincidence, or is that just a sign of "nobody ever even
modifies the default value"? So (a) feels like "nobody really cares,
and 8MB is fine, and nobody even bothers changing it - just do the
minimal thing".
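
A minimal user-space sketch of what (a) amounts to, not kernel code: when an exec actually gains privileges, the stack soft limit is reset to the 8MB default and nothing else is touched. `suid_stack_limit()` and `DEFAULT_STK_LIM` are hypothetical names (the latter mirrors the value of _STK_LIM), and keeping the inherited hard limit is an assumption of this sketch rather than something the proposal spells out.

```c
#include <sys/resource.h>

/* Mirrors the kernel's _STK_LIM (8MB); renamed to avoid clashing with it. */
#define DEFAULT_STK_LIM (8UL * 1024 * 1024)

/*
 * Hypothetical helper: given the stack rlimit inherited from the caller,
 * return the one a setuid program would start with under option (a).
 * Only the stack soft limit is reset; no other rlimit is involved.
 */
static struct rlimit suid_stack_limit(struct rlimit inherited, int gains_privs)
{
	if (gains_privs) {
		inherited.rlim_cur = DEFAULT_STK_LIM;
		/* Assumption: never raise the soft limit above the hard one. */
		if (inherited.rlim_max != RLIM_INFINITY &&
		    inherited.rlim_cur > inherited.rlim_max)
			inherited.rlim_cur = inherited.rlim_max;
	}
	return inherited;
}
```

A caller that inherited an unlimited stack would come out of this with a soft limit of 8MB, while an exec that gains no privileges keeps whatever it was given.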

As to (b), we could just have that whole INIT_RLIMITS per-namespace,
but only enable the stack limit by default. But then system admins
could change those limits and enable/disable individual rlimits to be
used by suid binaries. That feels like the "give the admin tools" approach.
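
One possible shape for (b), again as a user-space sketch under stated assumptions: a per-namespace table of default limits plus a bitmask saying which of them apply to suid execs, with only the stack bit set out of the box. The structure, the function names and `SKETCH_NLIMITS` are all hypothetical; nothing like this exists in the kernel today.

```c
#include <sys/resource.h>

#define SKETCH_NLIMITS 16	/* assumption: enough slots for every RLIMIT_* index */

/* Hypothetical per-namespace state: default values plus a per-limit enable bit. */
struct ns_suid_rlimits {
	struct rlimit defaults[SKETCH_NLIMITS];
	unsigned long enabled;	/* bit N set => defaults[N] applies to suid execs */
};

/* Out of the box only the stack limit is enabled, at the 8MB default. */
static void ns_suid_rlimits_init(struct ns_suid_rlimits *ns)
{
	for (int i = 0; i < SKETCH_NLIMITS; i++) {
		ns->defaults[i].rlim_cur = RLIM_INFINITY;
		ns->defaults[i].rlim_max = RLIM_INFINITY;
	}
	ns->defaults[RLIMIT_STACK].rlim_cur = 8UL * 1024 * 1024;
	ns->enabled = 1UL << RLIMIT_STACK;
}

/* Overwrite only the limits the admin has enabled; leave the rest inherited. */
static void ns_suid_rlimits_apply(const struct ns_suid_rlimits *ns,
				  struct rlimit task[SKETCH_NLIMITS])
{
	for (int i = 0; i < SKETCH_NLIMITS; i++)
		if (ns->enabled & (1UL << i))
			task[i] = ns->defaults[i];
}
```

An admin who also wanted to pin, say, RLIMIT_NOFILE for suid binaries would set that entry in `defaults` and flip its bit in `enabled`; every other limit stays inherited.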

And (c) would be the sane option, and what we already do for things
like GNU_STACK to enable/disable executable stacks. It really feels
like allowing the GNU_STACK segment to contain stack rlimit override
information would be the perfect tool for binaries to say "Yeah, I
need more stack than _STK_LIM".
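
A sketch of how (c) could combine with (a), assuming the request is carried in the `p_memsz` field of the PT_GNU_STACK program header. That encoding is an assumption of this example, not an agreed format, and `gnu_stack_hint()`, `suid_stack_size()` and `DEFAULT_STK_LIM` are hypothetical names. The code only walks an in-memory array of 64-bit program headers; reading and validating the ELF file is left out.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the kernel's _STK_LIM (8MB); renamed to avoid clashing with it. */
#define DEFAULT_STK_LIM (8UL * 1024 * 1024)

/*
 * Return the p_memsz of the PT_GNU_STACK program header, or 0 if the
 * binary has none. Treating p_memsz as a stack-size request is the
 * assumption of this sketch.
 */
static uint64_t gnu_stack_hint(const Elf64_Phdr *phdr, size_t phnum)
{
	for (size_t i = 0; i < phnum; i++)
		if (phdr[i].p_type == PT_GNU_STACK)
			return phdr[i].p_memsz;
	return 0;
}

/* Combine (a) and (c): start from the default, let the binary ask for more. */
static uint64_t suid_stack_size(const Elf64_Phdr *phdr, size_t phnum)
{
	uint64_t hint = gnu_stack_hint(phdr, phnum);

	return hint > DEFAULT_STK_LIM ? hint : DEFAULT_STK_LIM;
}
```

A binary with no annotation, or one asking for less than the default, ends up with 8MB; only an explicit larger request raises it. Whether such a request should be capped is a policy question this sketch does not try to answer.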

So I see many different approaches (that could be combined: I like
combining (a) and (c), for example), and absolutely none of them
involve the random "take some values from init".

And yes, a large part of this may be that I no longer feel like I can
trust "init" to do the sane thing. You all presumably know why.