Re: [PATCH v6 5/6] binfmt_*: scope path resolution of interpreters

From: Aleksa Sarai
Date: Sun May 12 2019 - 09:38:05 EST

On 2019-05-12, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Sat, May 11, 2019 at 7:37 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > I bet this will break something that already exists. An execveat()
> > flag to turn off /proc/self/exe would do the trick, though.
> Thinking more about it, I suspect it is (once again) wrong to let the
> thing that does the execve() control that bit.
> Generally, the less we allow people to affect the lifetime and
> environment of a suid executable, the better off we are.
> But maybe we could limit /proc/*/exe to at least not honor suid'ness
> of the target? Or does chrome/runc depend on that too?

Speaking for the runc side, we don't depend on this. It's possible
someone depends on this for fexecve(3) -- but, as mentioned before, on
newer kernels glibc uses execveat(AT_EMPTY_PATH).

I would like to point out though that I'm a little bit cautious about
/proc/self/exe-specific restrictions -- because a trivial way to get
around them would be to just open it with O_PATH (and you end up with a
/proc/self/fd/ which is equivalent). Unfortunately, blocking setuid exec
on all O_PATH descriptors would break even execveat(AT_EMPTY_PATH) of
descriptors for setuid binaries.
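
To make the O_PATH point concrete, here is a rough userspace sketch (not
part of any patch). It opens /proc/self/exe with O_PATH and checks that
the resulting fd refers to the same inode as the magiclink itself. The
function name is made up for illustration, and the O_PATH fallback value
assumes the asm-generic definition.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Fallback in case the libc headers hide O_PATH; this is the
 * asm-generic value and is an assumption for other architectures. */
#ifndef O_PATH
#define O_PATH 010000000
#endif

/*
 * Hypothetical helper: returns 1 if an O_PATH fd obtained by opening
 * /proc/self/exe refers to the same file as /proc/self/exe itself,
 * 0 if it does not, and -1 on error.
 */
int opath_fd_matches_exe(void)
{
	struct stat via_link, via_fd;
	int fd, ret;

	/* open() follows the magiclink, so fd refers to the binary. */
	fd = open("/proc/self/exe", O_PATH | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* stat() follows the magiclink; fstat() works on O_PATH fds. */
	if (stat("/proc/self/exe", &via_link) < 0 || fstat(fd, &via_fd) < 0) {
		close(fd);
		return -1;
	}

	ret = via_link.st_dev == via_fd.st_dev &&
	      via_link.st_ino == via_fd.st_ino;
	close(fd);
	return ret;
}
```

On a system with /proc mounted I would expect this to return 1, which is
why a restriction keyed only on the /proc/self/exe path is easy to
sidestep.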

The patches I mentioned (which Andy and I discussed off-list) would
effectively make the magiclink modes in /proc/ affect how you can
operate on the path (no write bit in the mode means you cannot re-open
it for writing). One aspect of this is how to handle O_PATH, and in
particular how we handle an O_PATH re-open of an already-restricted
magiclink.

Maybe we could make it so that setuid is disallowed if you are dealing
with an O_PATH fd which was a magiclink. Effectively, on O_PATH open you
get an fmode_t saying FMODE_SETUID_EXEC_ALLOWED *but* if the path is a
magiclink this fmode gets dropped. When the fd is given to
execveat(AT_EMPTY_PATH) the fmode is checked and setuid-exec is not
allowed if it is absent.
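
A toy model of that idea, just to pin down the semantics I have in mind.
This is plain userspace C, not kernel code: only the name
FMODE_SETUID_EXEC_ALLOWED comes from the proposal above, while the bit
value, the struct and both function names are invented for illustration.
Whether a failed check should refuse the exec or merely ignore the
setuid bits is left open here.

```c
#include <stdbool.h>

/* Name from the proposal; the bit value is arbitrary (assumption). */
#define FMODE_SETUID_EXEC_ALLOWED 0x1u

/* Hypothetical stand-in for the f_mode of a struct file. */
struct opath_file {
	unsigned int f_mode;
};

/*
 * Model of an O_PATH open: the flag is granted by default and dropped
 * when the path being opened is a magiclink.
 */
struct opath_file model_opath_open(bool via_magiclink)
{
	struct opath_file f = { .f_mode = FMODE_SETUID_EXEC_ALLOWED };

	if (via_magiclink)
		f.f_mode &= ~FMODE_SETUID_EXEC_ALLOWED;
	return f;
}

/*
 * Model of the check at execveat(AT_EMPTY_PATH) time: setuid bits on
 * the binary are honoured only if the fd still carries the flag.
 */
bool model_honour_setuid(const struct opath_file *f, bool binary_is_setuid)
{
	return binary_is_setuid && (f->f_mode & FMODE_SETUID_EXEC_ALLOWED);
}
```

In this model an fd from a regular O_PATH open of a setuid binary keeps
its setuid behaviour, while one obtained through a magiclink does not.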

[I assume in this discussion "setuid" means "setuid + setcap", right?]

Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH

