Re: [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems

From: Dave Chinner
Date: Wed May 04 2016 - 22:25:39 EST


On Wed, May 04, 2016 at 06:44:14PM -0700, Andy Lutomirski wrote:
> On Wed, May 4, 2016 at 5:23 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Wed, May 04, 2016 at 04:26:46PM +0200, Djalal Harouni wrote:
> >> This is version 2 of the VFS:userns support portable root filesystems
> >> RFC. Changes since version 1:
> >>
> >> * Update documentation and remove some ambiguity about the feature.
> >> Based on Josh Triplett comments.
> >> * Use a new email address to send the RFC :-)
> >>
> >>
> >> This RFC tries to explore how to support filesystem operations inside
> >> user namespace using only VFS and a per mount namespace solution. This
> >> allows to take advantage of user namespace separations without
> >> introducing any change at the filesystems level. All this is handled
> >> with the virtual view of mount namespaces.
> >
> > [...]
> >
> >> As an example if the mapping 0:65535 inside mount namespace and outside
> >> is 1000000:1065536, then 0:65535 will be the range that we use to
> >> construct UIDs/GIDs mapping into init_user_ns and use it for on-disk
> >> data. They represent the persistent values that we want to write to the
> >> disk. Therefore, we don't keep track of any UID/GID shift that was applied
> >> before, it gives portability and allows to use the previous mapping
> >> which was freed for another root filesystem...
> >
> > So let me get this straight. Two /isolated/ containers, different
> > UID/GID mappings, sharing the same files and directories. Create a
> > new file in a writeable directory in container 1, namespace
> > information gets stripped from on-disk uid/gid representation.
>
> I think the intent is a totally separate superblock for each
> container. Djalal, am I right?

I'm pretty sure you can't have multiple superblocks point to the
same backing device. Each superblock would then think it's the sole
owner of the filesystem and all we get out of that is incoherent
caching and a corrupt on-disk filesystem.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx