Re: [PATCH v5 3/6] fs: Enable to enforce noexec mounts or file exec through O_MAYEXEC

From: Kees Cook
Date: Wed May 13 2020 - 19:27:45 EST


On Wed, May 13, 2020 at 11:37:16AM -0400, Stephen Smalley wrote:
> On Tue, May 5, 2020 at 11:33 AM Mickaël Salaün <mic@xxxxxxxxxxx> wrote:
> >
> > Enable to forbid access to files open with O_MAYEXEC. Thanks to the
> > noexec option from the underlying VFS mount, or to the file execute
> > permission, userspace can enforce these execution policies. This may
> > allow script interpreters to check execution permission before reading
> > commands from a file, or dynamic linkers to allow shared object loading.
> >
> > Add a new sysctl fs.open_mayexec_enforce to enable system administrators
> > to enforce two complementary security policies according to the
> > installed system: enforce the noexec mount option, and enforce
> > executable file permission. Indeed, because of compatibility with
> > installed systems, only system administrators are able to check that
> > this new enforcement is in line with the system mount points and file
> > permissions. A following patch adds documentation.
> >
> > For tailored Linux distributions, it is possible to enforce such
> > restriction at build time thanks to the CONFIG_OMAYEXEC_STATIC option.
> > The policy can then be configured with CONFIG_OMAYEXEC_ENFORCE_MOUNT and
> > CONFIG_OMAYEXEC_ENFORCE_FILE.
> >
> > Being able to restrict execution also enables to protect the kernel by
> > restricting arbitrary syscalls that an attacker could perform with a
> > crafted binary or certain script languages. It also improves multilevel
> > isolation by reducing the ability of an attacker to use side channels
> > with specific code. These restrictions can natively be enforced for ELF
> > binaries (with the noexec mount option) but require this kernel
> > extension to properly handle scripts (e.g., Python, Perl). To get a
> > consistent execution policy, additional memory restrictions should also
> > be enforced (e.g. thanks to SELinux).
> >
> > Signed-off-by: Mickaël Salaün <mic@xxxxxxxxxxx>
> > Reviewed-by: Thibaut Sautereau <thibaut.sautereau@xxxxxxxxxxx>
> > Cc: Aleksa Sarai <cyphar@xxxxxxxxxx>
> > Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> > Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> > ---
>
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 33b6d372e74a..70f179f6bc6c 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -411,10 +412,90 @@ static int sb_permission(struct super_block *sb, struct inode *inode, int mask)
> <snip>
> > +#if defined(CONFIG_SYSCTL) && !defined(CONFIG_OMAYEXEC_STATIC)
> > +int proc_omayexec(struct ctl_table *table, int write, void __user *buffer,
> > + size_t *lenp, loff_t *ppos)
> > +{
> > + int error;
> > +
> > + if (write) {
> > + struct ctl_table table_copy;
> > + int tmp_mayexec_enforce;
> > +
> > + if (!capable(CAP_MAC_ADMIN))
> > + return -EPERM;
>
> Not fond of using CAP_MAC_ADMIN here (or elsewhere outside of security
> modules). The ability to set this sysctl is not equivalent to being
> able to load a MAC policy, set arbitrary MAC labels on
> processes/files, etc.

That's fair. In that case, perhaps this could just use the existing
_sysadmin helper? (Though I should note that these perm checks actually
need to be in the open, not the read/write ... I thought there was a
series to fix that, but I can't find it now. Regardless, that's
orthogonal to this series.)

> > + * omayexec_inode_permission - Check O_MAYEXEC before accessing an inode
> > + *
> > + * @inode: Inode to check permission on
> > + * @mask: Right to check for (%MAY_OPENEXEC, %MAY_EXECMOUNT, %MAY_EXEC)
> > + *
> > + * Returns 0 if access is permitted, -EACCES otherwise.
> > + */
> > +static inline int omayexec_inode_permission(struct inode *inode, int mask)
> > +{
> > + if (!(mask & MAY_OPENEXEC))
> > + return 0;
> > +
> > + if ((sysctl_omayexec_enforce & OMAYEXEC_ENFORCE_MOUNT) &&
> > + !(mask & MAY_EXECMOUNT))
> > + return -EACCES;
> > +
> > + if (sysctl_omayexec_enforce & OMAYEXEC_ENFORCE_FILE)
> > + return generic_permission(inode, MAY_EXEC);
> > +
> > + return 0;
> > +}
>
> I'm wondering if this is being done at the wrong level. I would think
> that OMAYEXEC_ENFORCE_FILE would mean to check file execute permission
> with respect to all mechanisms/policies, including DAC,
> filesystem-specific checking (inode->i_op->permission), security
> modules, etc. That requires more than just calling
> generic_permission() with MAY_EXEC, which only covers the default
> DAC/ACL logic; you'd need to take the handling up a level to
> inode_permission() and re-map MAY_OPENEXEC to MAY_EXEC for
> do_inode_permission() and security_inode_permission() at least.

Oh, yeah, that's a good point. Does this need to be a two-pass check, or
can MAY_OPENEXEC get expanded to MAY_EXEC here? Actually, why is this so
deep at all? Shouldn't this be in may_open()?

Like, couldn't just the entire thing just be:

diff --git a/fs/namei.c b/fs/namei.c
index a320371899cf..0ab18e19f5da 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2849,6 +2849,13 @@ static int may_open(const struct path *path, int acc_mode, int flag)
break;
}

+ if (unlikely(mask & MAY_OPENEXEC)) {
+ if (sysctl_omayexec_enforce & OMAYEXEC_ENFORCE_MOUNT &&
+ path_noexec(path))
+ return -EACCES;
+ if (sysctl_omayexec_enforce & OMAYEXEC_ENFORCE_FILE)
+ acc_mode |= MAY_EXEC;
+ }
error = inode_permission(inode, MAY_OPEN | acc_mode);
if (error)
return error;

> Alternatively, we can modify each individual filesystem (that
> implements its own i_op->permission) and security module to start
> handling MAY_OPENEXEC and have them choose to remap it to a file
> execute check (or not) independent of the sysctl. Not sure of your

Eek, no, this should be centralized in the VFS, not per-filesystem, but
I do see that it might be possible for a filesystem to actually do the
MAY_OPENEXEC test internally, so the two-pass check wouldn't be needed.
But... I think that can't happen until _everything_ can do the single
pass check, so we always have to make the second call too.

> intent. As it stands, selinux_inode_permission() will ignore the new
> MAY_OPENEXEC flag until someone updates it. Likewise for Smack.
> AppArmor/TOMOYO would probably need to check and handle FMODE_EXEC in
> their file_open hooks since they don't implement inode_permission().

Is there any need to teach anything about MAY_OPENEXEC? It'll show up
for the LSMs as (MAY_OPEN | MAY_EXEC).

--
Kees Cook