Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount

From: Amir Goldstein
Date: Tue Feb 07 2017 - 12:59:07 EST


On Tue, Feb 7, 2017 at 6:37 PM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 2017-02-07 at 01:19 -0800, Christoph Hellwig wrote:
>> On Sat, Feb 04, 2017 at 11:19:32AM -0800, James Bottomley wrote:
>> > This allows any subtree to be uid/gid shifted and bound elsewhere.
>> > It does this by operating similarly to overlayfs. Its primary use
>> > is for shifting the underlying uids of filesystems used to support
>> > unprivileged (uid shifted) containers. The usual use case here is
>> > that the container is operating with a uid shifted unprivileged
>> > root but sometimes needs to make use of or work with a filesystem
>> > image that has root at real uid 0.
>> >
>> > The mechanism is to allow any subordinate mount namespace to mount
>> > a shiftfs filesystem (by marking it FS_USERNS_MOUNT) but only
>> > allowing it to mount marked subtrees (using the -o mark option as
>> > root). Once mounted, the subtree is mapped via the super block
>> > user namespace so that the interior ids of the mounting user
>> > namespace are the ids written to the filesystem.
>>
>> Please move this into VFS instead of a stackable fs. We might need
>> additional parameters to getattr/setattr to specify the ID
>> translation, but that's way better than a horrible hack like this.
>
> I would need a lot more than that: getattr controls the cosmetic
> permission display to the user, but enforcement is done in the core
> permission checks which are inode based. To make this a real bind
> mount, the core permission checks will have to become subtree aware
> because knowledge of whether we need a uid shift in the permission
> check becomes a subtree property. Effectively inode_permission would
> become dentry_permission and generic_permission would take a dentry
> instead of an inode. This will be a huge amount of VFS and underlying
> filesystem churn, since the permissions calls are threaded through a
> huge chunk of code.
>

I am not even sure that would be enough.
A dentry does not contain information about the mount the user came
from, and the sb contains only information about the user ns of the
mounter of the file system, not the mounter of the bind mount, right?
I think I am missing some big pieces of the big picture.
Would love to hear what Eric has to say.
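
To make the concern concrete, here is a toy userspace model, not kernel
code. Every name in it is hypothetical, and the per-mount "shift" field is
an assumption made purely for illustration; nothing like it is proposed in
this thread. The only point is that a check which receives an inode (or a
dentry) cannot see which mount the lookup came through, whereas a check
which receives a (mount, dentry) pair, as struct path does in the kernel,
can.

```c
#include <stdbool.h>

/* Toy userspace model, not kernel code: every name here is hypothetical. */
typedef unsigned int toy_uid_t;

struct toy_inode  { toy_uid_t uid; };            /* owner as stored on disk */
struct toy_dentry { struct toy_inode *inode; };  /* name -> inode, no mount info */
struct toy_mount  { toy_uid_t shift; };          /* assumed per-mount uid offset */
struct toy_path   {                              /* (mount, dentry) pair */
	struct toy_mount  *mnt;
	struct toy_dentry *dentry;
};

/*
 * Inode-based check: it sees only the stored owner, so it has no way
 * to apply a shift that depends on the mount.
 */
static bool toy_inode_owner_ok(const struct toy_inode *inode, toy_uid_t caller)
{
	return inode->uid == caller;
}

/*
 * Path-based check: the mount is available, so the stored owner can be
 * shifted before it is compared with the caller.
 */
static bool toy_path_owner_ok(const struct toy_path *path, toy_uid_t caller)
{
	return path->dentry->inode->uid + path->mnt->shift == caller;
}
```

With one inode owned by uid 0 and reached through two mounts, one with
shift 0 and one with shift 100000, the path-based check can accept caller
100000 on the shifted mount only, while the inode-based check gives the
same answer for both mounts.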