Re: [PATCH 3/5] exec: Remove recursion from search_binary_handler
From: Eric W. Biederman
Date: Thu May 14 2020 - 12:52:59 EST
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
> On Tue, May 12, 2020 at 11:46 AM Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> I am still thinking about this one, but here is where I am at. At a
>> practical level passing the file descriptor of the script to interpreter
>> seems like something we should encourage in the long term. It removes
>> races and it is cheaper because then the interpreter does not have to
>> turn around and open the script itself.
>
> Yeah, I think we should continue to support it, because I think it's
> the right thing to do (and we might just end up having compatibility
> issues if we don't).
>
> How about trying to move the logic to the common code, out of binfmt_misc?
>
> IOW, how about something very similar to your "brpm->preserve_creds"
> thing that you did for the credentials (also for binfmt_misc, which
> shouldn't surprise anybody: binfmt_misc is simply the "this is the
> generic thing for letting user mode do the final details").
>
>> Calling fd_install in binfmt_misc still seems wrong, as that exposes
>> the new file descriptor to user space with the old creds.
>
> Right. And it really would be good to simply not have these kinds of
> very special cases inside the low-level binfmt code: I'd much rather
> have the special cases in the generic code, so that we see what the
> ordering is etc. One of the big problems with all these binfmt
> callbacks has been the fact that it makes it so hard to think about
> and change the generic code, because the low-level binfmt handlers all
> do their own special thing.
>
> So moving it to generic code would likely simplify things from that
> angle, even if the actual complexity of the feature itself remains.
>
> Besides, we really have exposed this to other code anyway thanks to
> that whole bprm->interp_data thing, and the AT_EXECFD AUX entries that
> we have. So it's not really "internal" to binfmt_misc _anyway_.
>
> So how about we just move the fd_binary logic to the generic execve
> code, and just binfmt_misc set the flag for "yes, please do this",
> exactly like "preserve_creds"?
>
>> It is possible although unlikely for userspace to find the file
>> descriptor without consulting AT_EXECFD so just to be conservative I
>> think we should install the file descriptor in begin_new_exec even if
>> the next interpreter does not support AT_EXECFD.
>
> Ack. I think the AT_EXECFD thing is a sign that this isn't internal to
> binfmt_misc, but it also shouldn't be gating this issue. In reality,
> ELF is the only real binary format that matters - the script/misc
> binfmts are just indirection entries - and it supports AT_EXECFD, so
> let's just ignore the theoretical case of "maybe nobody exposes it".
>
> So yes, just make it part of begin_new_exec(), and there's no reason
> to support more than a single fd. No stacks or arrays of these things
> required, I feel. It's not like AT_EXECFD supports the notion of
> multiple fd's being reported anyway, nor does it make any sense to
> have some kind of nested misc->misc binfmt nesting.
>
> So making that whole interp_data and fd_binary thing be a generic
> layer thing would make the search_binary_handler() code in binfmt_misc
> be a pure tailcall too, and then the conversion to a loop ends up
> working and being the right thing.
That is pretty much what I have been thinking. I have just been taking
it slow so I find as many funny corner cases as I can.
Nothing ever clears the BINPRM_FLAGS_EXECFD so the current code can
not support nesting.
Now I do think a nested misc->misc binfmt thing can make sense in
principal. I have an old dos spectrum emulator that I use to play some
of the games that I grew up with. Running that emulator makes me two
emulators deep. I can also imagine writting a domain specific language
in python or perl, and setting things up so scripts in the domain
specific language can be run directly.
So I think I need to deliberately test and prevent a nested misc->misc,
just so data structures don't get stomped. If the cases where it could
useful prove sufficiently interesting we can enable them later.
Eric