Re: plan9 semantics on Linux - mount namespaces

From: Eric W. Biederman
Date: Fri Feb 16 2018 - 13:27:28 EST


Enrico Weigelt <lkml@xxxxxxxxx> writes:

> On 13.02.2018 22:12, Enrico Weigelt wrote:
>
> CC @containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
>
>> Hi folks,
>>
>>
>> I'm currently trying to implement plan9 semantics on Linux and
>> yet sorting out how to do the mount namespace handling.
>>
>> On plan9, any unprivileged process can create its own namespace
>> and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
>>
>> What is the reason for not allowing arbitrary users to create their
>> own private mount namespace ? What could go wrong here ?

Setuid-root executables could be fooled. An easy case is fooling
/bin/su into reading a different copy of /etc/shadow, which would
allow arbitrary switching between users.

>> IMHO, we could allow mount/bind under the following conditions:
>>
>> * the process is in a private mount namespace
>> * no suid-flag is honored (either force all mounts to nosuid or
>>   completely mask it out)
>> * only certain whitelisted filesystems allowed (eg. 9P and FUSE)
>>
>> Maybe that all could be enabled by a new capability.
>>
>>
>> any suggestions ?

User namespaces limit the contained processes to not having any
permissions outside of the user namespace, while still allowing the
full unix permission model inside the user namespace.
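
To make that concrete, here is a minimal sketch in Python (via ctypes) of an
unprivileged process creating a user namespace together with a private mount
namespace, mapping itself to root inside it, and mounting a tmpfs. It is an
illustration only, not code from any patch set. The helper names id_map_line
and mount_tmpfs_unprivileged are made up for this example; the flag values
are the ones from the kernel headers. It assumes a kernel with unprivileged
user namespaces enabled; a distro setting or a sandbox may refuse the
unshare, in which case the helper just reports failure.

```python
import ctypes
import os

# Flag values from <linux/sched.h> and <sys/mount.h>.
CLONE_NEWNS = 0x00020000
CLONE_NEWUSER = 0x10000000
MS_NOSUID = 0x2
MS_NODEV = 0x4


def id_map_line(inside_id, outside_id, count=1):
    """Format one line of /proc/<pid>/uid_map or gid_map (hypothetical helper)."""
    return f"{inside_id} {outside_id} {count}\n"


def _child(uid, gid, target):
    """Runs in the forked child; returns 0 on success, 1 on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]
    # New user namespace plus a mount namespace owned by it.
    if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0:
        return 1
    # setgroups must be denied before an unprivileged gid_map write.
    with open("/proc/self/setgroups", "w") as f:
        f.write("deny")
    # Map our own ids to root inside the namespace.
    with open("/proc/self/uid_map", "w") as f:
        f.write(id_map_line(0, uid))
    with open("/proc/self/gid_map", "w") as f:
        f.write(id_map_line(0, gid))
    # "root" here only has power over this namespace; the mount is private.
    rc = libc.mount(b"none", os.fsencode(target), b"tmpfs",
                    MS_NOSUID | MS_NODEV, None)
    return 0 if rc == 0 else 1


def mount_tmpfs_unprivileged(target):
    """Hypothetical helper: True if a tmpfs could be mounted on `target`
    from inside a fresh user + mount namespace, False otherwise."""
    uid, gid = os.geteuid(), os.getegid()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            status = _child(uid, gid, target)
        except Exception:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
```

Calling mount_tmpfs_unprivileged("/tmp") as an ordinary user should return
True where user namespaces are permitted. The tmpfs is visible only to the
child process and goes away when that process exits, so nothing changes in
the caller's view of the filesystem.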

I am in the final stages of getting the changes in the vfs and in fuse
to allow unprivileged users to mount that filesystem. plan9fs would
also be a candidate for that kind of treatment if it had a maintainer.

Eric