Re: [GIT PULL] Add trusted_for(2) (was O_MAYEXEC)

From: Linus Torvalds
Date: Mon Apr 04 2022 - 18:12:19 EST

On Mon, Apr 4, 2022 at 1:29 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote:
> This initial proposal was using a new faccessat2(2) flag:
> What do you think about that? I'm happy to get back to this version if
> everyone is OK with it.

I'm certainly happi_er_ with that, but I find that particular patch
odd for other reasons.

In no particular order:

- what's with the insane non-C syntax? Double parentheses have no
actual meaning in C:

if ((flags & AT_INTERPRETED)) {
if ((mode & MAY_EXEC)) {

so I don't understand why you'd use that strance thing.

- why is this an AT_INTERPRETED flag? I don't understand the name, I
don't understand the semantics.

Why would that flag have the same value as AT_SYMLINK_FOLLOW?

Why isn't this just a new _mode_ bit, which is what I feel is
sensible? We only use three bits (with no bits set meaning
"existence"), so we have *tons* of bits left in that namespace, and it
would make much more sense to me to have

#define EXECVE_OK 8

which is the same as the "group exec" bit, so it actually makes
some kind of sense too.

- related to that "I don't understand the semantics", the
"documentation" for that thing doesn't make sense either:

+ The
+ main usage is for script
+ interpreters to enforce a policy
+ consistent with the kernel's one
+ (through sysctl configuration or LSM
+ policy). */

Now, what I *think* you mean is

(1) user-space executable loaders want to be able to test the *same*
policy as the kernel does for execve()

(2) access(path, EXECVE_OK) will do the same permission checks as
"execve()" would do for that path

(3) if you already have the fd open, use "faccess(fd, NULL,

(4) maybe we want to add a flag for the "euid vs real uid", and that
would be in the "flags" field, since that changes the actual *lookup*

Note that that (4) is something that some normal user space has wanted
in the past too (GNU libcs has a "eaccess()" thing for "effective uid

And yes, we still need to talk details:

- no backwards compatibility issues, because we've happily always
checked 'mode' for being valid, so old kernels will always return

- POSIX namespace issues for EXECVE_OK means that the name probably
needs some thinking (maybe we just need to call it __ACCESS_OK_EXECVE
internally or something - the kernel actually doesn't even export the
existing [FRWX]_OK values, because we "know" they map tho the usual
"owner RWX" bits, with F being "no bit set")

- I really want the exact semantics very clearly defined. I think
it's ok to say "exact same security check as for 'execve()'", but even
then we need to have that discussion about

(a) "what about suid bits that user space cannot react to"
(b) that whole "effective vs real" discussion

So to recap: I'm very much ok with some access() extension, but I
think even that very much needs clarification, and the existing patch
is just odd in many many ways.