Re: [PATCH v5 00/11] FUSE mounts from non-init user namespaces

From: Dongsu Park
Date: Tue Jan 09 2018 - 10:05:48 EST


Hi,

On Mon, Dec 25, 2017 at 8:05 AM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Dongsu Park <dongsu@xxxxxxxxxx> writes:
>
>> This patchset v5 is based on work by Seth Forshee and Eric Biederman.
>> The latest patchset was v4:
>> https://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1132206.html
>>
>> At the moment, filesystems backed by physical medium can only be mounted
>> by real root in the initial user namespace. This restriction exists
>> because if it's allowed for root user in non-init user namespaces to
>> mount the filesystem, then it effectively allows the user to control the
>> underlying source of the filesystem. In case of FUSE, the source would
>> mean any underlying device.
>>
>> However, in many use cases such as containers, it's necessary to allow
>> filesystems to be mounted from non-init user namespaces. Goal of this
>> patchset is to allow FUSE filesystems to be mounted from non-init user
>> namespaces. Support for other filesystems like ext4 are not in the
>> scope of this patchset.
>>
>> Let me describe how to test mounting from non-init user namespaces. It's
>> assumed that tests are done via sshfs, a userspace filesystem based on
>> FUSE with ssh as backend. Testing system is Fedora 27.
>
> In general I am for this work, and more bodies and more eyes on it is
> generally better.
>
> I will review this after the New Year, I am out for the holidays right
> now.

Thanks. I'll wait for your review.

Dongsu

> Eric
>
>
>>
>> ====
>> $ sudo dnf install -y sshfs
>> $ sudo mkdir -p /mnt/userns
>>
>> ### workaround to get the sshfs permission checks
>> $ sudo chown -R $UID:$UID /etc/ssh/ssh_config.d /usr/share/crypto-policies
>>
>> $ unshare -U -r -m
>> # sshfs root@localhost: /mnt/userns
>>
>> ### You can see sshfs being mounted from a non-init user namespace
>> # mount | grep sshfs
>> root@localhost: on /mnt/userns type fuse.sshfs
>> (rw,nosuid,nodev,relatime,user_id=0,group_id=0)
>>
>> # touch /mnt/userns/test
>> # ls -l /mnt/userns/test
>> -rw-r--r-- 1 root root 0 Dec 11 19:01 /mnt/userns/test
>> ====
>>
>> Open another terminal, check the mountpoint from outside the namespace.
>>
>> ====
>> $ grep userns /proc/$(pidof sshfs)/mountinfo
>> 131 102 0:35 / /mnt/userns rw,nosuid,nodev,relatime - fuse.sshfs
>> root@localhost: rw,user_id=0,group_id=0
>> ====
>>
>> After all tests are done, you can unmount the filesystem
>> inside the namespace.
>>
>> ====
>> # fusermount -u /mnt/userns
>> ====
>>
>> Changes since v4:
>> * Remove other parts like ext4 to keep the patchset minimal for FUSE
>> * Add and change commit messages
>> * Describe how to test non-init user namespaces
>>
>> TODO:
>> * Think through potential security implications. There are 2 patches
>> being prepared for security issues. One is "ima: define a new policy
>> option named force" by Mimi Zohar, which adds an option to specify
>> that the results should not be cached:
>> https://marc.info/?l=linux-integrity&m=151275680115856&w=2
>> The other one is to basically prevent FUSE results from being cached,
>> which is still in progress.
>>
>> * Test IMA/LSMs. Details are written in
>> https://github.com/kinvolk/fuse-userns-patches/blob/master/tests/TESTING_INTEGRITY.md
>>
>> Patches 1-2 deal with an additional flag of lookup_bdev() to check for
>> additional inode permission.
>>
>> Patches 3-7 allow the superblock owner to change ownership of inodes, and
>> deal with additional capability checks w.r.t user namespaces.
>>
>> Patches 8-10 allow FUSE filesystems to be mounted outside of the init
>> user namespace.
>>
>> Patch 11 handles a corner case of non-root users in EVM.
>>
>> The patchset is also available in our github repo:
>> https://github.com/kinvolk/linux/tree/dongsu/fuse-userns-v5-1
>>
>>
>> Eric W. Biederman (1):
>> fs: Allow superblock owner to change ownership of inodes
>>
>> Seth Forshee (10):
>> block_dev: Support checking inode permissions in lookup_bdev()
>> mtd: Check permissions towards mtd block device inode when mounting
>> fs: Don't remove suid for CAP_FSETID for userns root
>> fs: Allow superblock owner to access do_remount_sb()
>> capabilities: Allow privileged user in s_user_ns to set security.*
>> xattrs
>> fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems
>> fuse: Support fuse filesystems outside of init_user_ns
>> fuse: Restrict allow_other to the superblock's namespace or a
>> descendant
>> fuse: Allow user namespace mounts
>> evm: Don't update hmacs in user ns mounts
>>
>> drivers/md/bcache/super.c | 2 +-
>> drivers/md/dm-table.c | 2 +-
>> drivers/mtd/mtdsuper.c | 6 +++++-
>> fs/attr.c | 34 ++++++++++++++++++++++++++--------
>> fs/block_dev.c | 13 ++++++++++---
>> fs/fuse/cuse.c | 3 ++-
>> fs/fuse/dev.c | 11 ++++++++---
>> fs/fuse/dir.c | 16 ++++++++--------
>> fs/fuse/fuse_i.h | 6 +++++-
>> fs/fuse/inode.c | 35 +++++++++++++++++++++--------------
>> fs/inode.c | 6 ++++--
>> fs/ioctl.c | 4 ++--
>> fs/namespace.c | 4 ++--
>> fs/proc/base.c | 7 +++++++
>> fs/proc/generic.c | 7 +++++++
>> fs/proc/proc_sysctl.c | 7 +++++++
>> fs/quota/quota.c | 2 +-
>> include/linux/fs.h | 2 +-
>> kernel/user_namespace.c | 1 +
>> security/commoncap.c | 8 ++++++--
>> security/integrity/evm/evm_crypto.c | 3 ++-
>> 21 files changed, 127 insertions(+), 52 deletions(-)