Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount

From: Djalal Harouni
Date: Tue Feb 07 2017 - 14:48:29 EST


On Tue, Feb 7, 2017 at 7:20 PM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 2017-02-07 at 19:59 +0200, Amir Goldstein wrote:
>> On Tue, Feb 7, 2017 at 6:37 PM, James Bottomley
>> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> > On Tue, 2017-02-07 at 01:19 -0800, Christoph Hellwig wrote:
>> > > On Sat, Feb 04, 2017 at 11:19:32AM -0800, James Bottomley wrote:
>> > > > This allows any subtree to be uid/gid shifted and bound
>> > > > elsewhere.
>> > > > It does this by operating similarly to overlayfs. Its primary
>> > > > use
>> > > > is for shifting the underlying uids of filesystems used to
>> > > > support
>> > > > unprivileged (uid shifted) containers. The usual use case here
>> > > > is
>> > > > that the container is operating with an uid shifted
>> > > > unprivileged
>> > > > root but sometimes needs to make use of or work with a
>> > > > filesystem
>> > > > image that has root at real uid 0.
>> > > >
>> > > > The mechanism is to allow any subordinate mount namespace to
>> > > > mount
>> > > > a shiftfs filesystem (by marking it FS_USERNS_MOUNT) but only
>> > > > allowing it to mount marked subtrees (using the -o mark option
>> > > > as
>> > > > root). Once mounted, the subtree is mapped via the super block
>> > > > user namespace so that the interior ids of the mounting user
>> > > > namespace are the ids written to the filesystem.
>> > >
>> > > Please move this into VFS instead of a stackable fs. We might
>> > > need
>> > > additional parameters to getattr/setattr to specify the ID
>> > > translation, but that's way better than a horrible hack like
>> > > this.
>> >
>> > I would need a lot more than that: getattr controls the cosmetic
>> > permission display to the user, but enforcement is done in the core
>> > permission checks which are inode based. To make this a real bind
>> > mount, the core permission checks will have to become subtree aware
>> > because knowledge of whether we need a uid shift in the permission
>> > check becomes a subtree property. Effectively inode_permission
>> > would
>> > become dentry_permission and generic_permission would take a dentry
>> > instead of an inode. This will be a huge amount of VFS and
>> > underlying
>> > filesystem churn, since the permissions calls are threaded through
>> > a
>> > huge chunk of code.
>> >
>>
>> I am not even sure that would be enough.
>> dentry does not contain information about the mount user came from,
>> and sb contains only information about the user ns of the mounter of
>> the file system, not the mounter of the bind mount, right?
>> I think I am missing some big pieces of the big picture.
>> Would love to hear what Eric has to say.
>
> I'm not really sure until it gets prototyped, but I think the
> filesystem user namespace would also have to become a subtree property.

Sorry, I don't want to derail the thread, but that was already prototyped.

> The whole reason for shiftfs being a properly mounted filesystem is
> because it needs a super block to capture the namespace it's being
> mounted in.
>
> However, when you have a container that you want remapping inside, you
> must have a user namespace which owns a mount namespace, so we can
> deduce the information from the mount namespace. All we probably need
> the subtree to tell us is if we're shifting or not.

That's one of the use cases that you will definitely end up with... if
anyone did read that incomplete VFS RFC proposal:

"2) The solution is based on VFS and mount namespaces, we use the user
namespace of the containing mount namespace to check if we should shift
UIDs/GIDs from/to virtual <=> on-disk view.
If a filesystem was mounted with "vfs_shift_uids" and "vfs_shift_gids"
options, and if it shows up inside a mount namespace that supports VFS
UIDs/GIDs shifts then during each access we will remap UID/GID either
to virtual or to on-disk view using simple helper functions to allow the
access. In case the mount or current mount namespace do not support VFS
UID/GID shifts, we fallback to the old behaviour, no shift is performed." [1]

[1] https://lkml.org/lkml/2016/5/4/411
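
To make the quoted paragraph a bit more concrete, here is a minimal
userspace sketch of the decision it describes. It is not code from the
RFC in [1]: the struct and function names (id_extent, mount_ctx,
shift_enabled, id_to_disk, id_to_virtual) are hypothetical, and it
assumes a single contiguous ID range, with "virtual" meaning the ID
the process is seen with and "on-disk" meaning the ID stored in the
filesystem. The only point it illustrates is the rule "shift when both
the filesystem and the mount namespace opt in, otherwise fall back to
no shift".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX

/* Hypothetical single-range map: virtual IDs
 * [virt_first, virt_first + count) correspond to on-disk IDs
 * [disk_first, disk_first + count). */
struct id_extent {
	uint32_t virt_first;
	uint32_t disk_first;
	uint32_t count;
};

/* Hypothetical stand-in for the state consulted on each access. */
struct mount_ctx {
	bool fs_shift_ids;    /* fs was mounted with the shift options */
	bool mntns_shift_ok;  /* current mount namespace supports shifts */
	struct id_extent map;
};

/* Shift only when both the filesystem and the mount namespace opt in. */
static bool shift_enabled(const struct mount_ctx *m)
{
	assert(m != NULL);
	return m->fs_shift_ids && m->mntns_shift_ok;
}

/* Virtual -> on-disk. Without shifting, the ID passes through
 * unchanged (the old behaviour). IDs outside the range are invalid. */
static uint32_t id_to_disk(const struct mount_ctx *m, uint32_t virt)
{
	if (!shift_enabled(m))
		return virt;
	if (virt < m->map.virt_first ||
	    virt - m->map.virt_first >= m->map.count)
		return INVALID_ID;
	return m->map.disk_first + (virt - m->map.virt_first);
}

/* On-disk -> virtual, the inverse of id_to_disk(). */
static uint32_t id_to_virtual(const struct mount_ctx *m, uint32_t disk)
{
	if (!shift_enabled(m))
		return disk;
	if (disk < m->map.disk_first ||
	    disk - m->map.disk_first >= m->map.count)
		return INVALID_ID;
	return m->map.virt_first + (disk - m->map.disk_first);
}
```

With a map of { 100000, 0, 65536 } and both flags set, id_to_disk()
turns 100000 into 0 and id_to_virtual() turns 0 back into 100000. If
either flag is clear, both helpers return their argument untouched.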



--
tixxdz