Re: user ns: arbitrary module loading

From: Eric W. Biederman
Date: Sat Mar 02 2013 - 23:12:52 EST


"Serge E. Hallyn" <serge@xxxxxxxxxx> writes:

> Quoting Kees Cook (keescook@xxxxxxxxxx):
>> The rearranging done for user ns has resulted in allowing arbitrary
>> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>> by what is assumed to be an unprivileged process.
>>
>> At present, it does look to require at least CAP_SETUID along the way
>> to set up the uidmap (but things like the setuid helper newuidmap
>> might soon start providing such a thing by default).

CAP_SETUID is not needed.

>> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
>> examines module symbols to verify that request_module() for a
>> filesystem only loads a module that defines "register_filesystem"
>> (among other things).
>>
>> -Kees
>>
>> [1] https://twitter.com/grsecurity/status/307473816672665600
>
> So the concern is root in a child user namespace doing
>
> mount -t randomfs <...>
>
> in which case do_new_mount() checks ns_capable(), not capable(),
> before trying to load a module for randomfs.
>
> As well as (secondly) the fact that there is no enforcement on
> the format of the module names (i.e. fs-*).
>
> Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
> At least Eric Paris is strongly against it.

What is wrong with GRKERNSEC_MODHARDEN? It took a quick look and the
code is far from clean. But that would not be a fundamental objection
from keeping code like that out of the kernel.

It is also entertaining to read security code that won't even build with
CONFIG_UIDGID_STRICT_TYPE_CHECKS enabled.

> But how about if we
> add a check for 'current_user_ns() == &init_user_ns' at that place
> instead?
>
> Eric Biederman, do you have any objections to that?

The obvious solution here is to test for CAP_SYS_ADMIN rather than
current_user_ns == &init_user_ns before we request the module here. As
that is what was previously required on this path.

Reading the comments the concerns are.
- Non-root users are allowed to load obscure and possibly kernel
modules.
- get_fs_type can trigger the load of any kernel module.

At a practical level I don't see adding a capalbe(CAP_SYS_ADMIN) check
as having much effect for the functionality currently present in user
namespaces today as the filesystems that an legal to mount in a user
namespace (ramfs, tmpfs, mqueuefs, sysfs, proc, devpts) are so common
most of them can not even be built as modules and even if they are
modules the modules will already be loaded. So I will see about adding
a capable(CAP_SYS_ADMIN) check to shore things up for the short term.

In the longer term I very much would like to get loopback devices
and mounts of filesystems on those loopback devices working, and being
able to mount filesystems from usb sticks that people commonly plug in,
and remove the need for privileged daemons to do that work. At that
point manually having to do something that was automatic before will
either mean a regression in functionality or bugs as people manually
load things.


So I am wondering what I a good policy should be. Should we trust
kernel modules to not be buggy (especially if they were signed as part
of the build process)? Do we add some defense in depth and add
filesystem registration magic? Thinking...

We can limit the request_module in get_fs_type to just filesystems
fairly easily.

In include/linux/fs.h:

#define MODULE_ALIAS_FS(type) MODULE_ALIAS("fs-" __stringify(type))

In fs/filesystems.c:

if (request_moudle("fs-%.*s", len, name) == 0)

Then just add the appropriate MODULE_ALIAS_FS lines in all of the
filesystems. This also allows user space to say set the module loading
policy for filesystems using the blacklist and the alias keywords
in /etc/modprobe.d/*.conf.

That seems a whole lot simpler, more powerful and more maintainable than
what little I saw in GRKERNSEC_MODHARDEN to prevent loading of
non-filesystem modules from get_fs_type.

Eric

p.s. This is the patch I am looking at pushing to Linus in the near
future.

diff --git a/fs/filesystems.c b/fs/filesystems.c
index da165f6..5b0644d 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -273,7 +273,8 @@ struct file_system_type *get_fs_type(const char *name)
int len = dot ? dot - name : strlen(name);

fs = __get_fs_type(name, len);
- if (!fs && (request_module("%.*s", len, name) == 0))
+ if (!fs && capable(CAP_SYS_ADMIN) &&
+ (request_module("%.*s", len, name) == 0))
fs = __get_fs_type(name, len);

if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/