Re: [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems

From: Andy Lutomirski
Date: Wed May 04 2016 - 23:30:05 EST


On May 4, 2016 7:25 PM, "Dave Chinner" <david@xxxxxxxxxxxxx> wrote:
>
> On Wed, May 04, 2016 at 06:44:14PM -0700, Andy Lutomirski wrote:
> > On Wed, May 4, 2016 at 5:23 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > On Wed, May 04, 2016 at 04:26:46PM +0200, Djalal Harouni wrote:
> > >> This is version 2 of the VFS:userns support portable root filesystems
> > >> RFC. Changes since version 1:
> > >>
> > >> * Update documentation and remove some ambiguity about the feature.
> > >> Based on Josh Triplett's comments.
> > >> * Use a new email address to send the RFC :-)
> > >>
> > >>
> > >> This RFC explores how to support filesystem operations inside user
> > >> namespaces using only the VFS and a per-mount-namespace solution. This
> > >> makes it possible to take advantage of user namespace separation
> > >> without introducing any change at the filesystem level. All of this is
> > >> handled with the virtual view of mount namespaces.
> > >
> > > [...]
> > >
> > >> As an example, if the mapping is 0:65535 inside the mount namespace and
> > >> 1000000:1065535 outside, then 0:65535 will be the range that we use to
> > >> construct UIDs/GIDs mapping into init_user_ns and use it for on-disk
> > >> data. They represent the persistent values that we want to write to the
> > >> disk. Therefore, we don't keep track of any UID/GID shift that was applied
> > >> before. This gives portability and allows the previous mapping, once
> > >> freed, to be reused for another root filesystem...
> > >
> > > So let me get this straight. Two /isolated/ containers, different
> > > UID/GID mappings, sharing the same files and directories. Create a
> > > new file in a writeable directory in container 1, namespace
> > > information gets stripped from on-disk uid/gid representation.
> >
> > I think the intent is a totally separate superblock for each
> > container. Djalal, am I right?
>
> I'm pretty sure you can't have multiple superblocks point to the
> same backing device. Each superblock would then think it's the sole
> owner of the filesystem and all we get out of that is incoherent
> caching and a corrupt on-disk filesystem.

I meant separate backing stores, too.

--Andy

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
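
For readers following the mapping example quoted from the RFC, the shift
can be sketched roughly as below. This is only an illustration, assuming a
range of 65536 IDs that starts at 0 inside the mount namespace and at
1000000 outside. The names to_outside() and to_disk() are hypothetical and
do not come from the patch series.

```python
# Hypothetical sketch of the UID/GID shift described in the quoted RFC.
# None of these names or constants are taken from the actual patches.

INNER_FIRST = 0          # first ID as seen inside the mount namespace
OUTER_FIRST = 1_000_000  # first ID as seen outside (assumed start of range)
COUNT = 65_536           # assumed number of IDs in the mapping


def to_outside(inner_id: int) -> int:
    """Shift an ID seen inside the namespace to its outside value."""
    if not INNER_FIRST <= inner_id < INNER_FIRST + COUNT:
        raise ValueError(f"ID {inner_id} is not covered by the mapping")
    return OUTER_FIRST + (inner_id - INNER_FIRST)


def to_disk(outer_id: int) -> int:
    """Undo the shift, giving the unshifted value that would be persisted."""
    if not OUTER_FIRST <= outer_id < OUTER_FIRST + COUNT:
        raise ValueError(f"ID {outer_id} is not covered by the mapping")
    return INNER_FIRST + (outer_id - OUTER_FIRST)
```

Because only the unshifted value is stored in this sketch, a different
outside range could be chosen later without rewriting on-disk ownership,
which is the portability property the RFC text describes.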