Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]
From: Al Viro
Date: Sat Jun 02 2018 - 13:50:08 EST
On Sat, Jun 02, 2018 at 04:45:21PM +0100, David Howells wrote:
> Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > TBH, I would probably prefer separate mount_setattr(2) for that kind
> > of work, with something like
> > int mount_setattr(int dirfd, const char *path, int flags, int attr)
> > *not* opening any files.
> > flags:
> > AT_EMPTY_PATH, AT_NO_AUTOMOUNT, AT_RECURSIVE
> I would call these MOUNT_SETATTR_* rather than AT_*.
Why? AT_EMPTY_PATH/AT_NO_AUTOMOUNT are common with other ...at()
syscalls; AT_RECURSIVE - maybe, but it's still more like AT_...
namespace fodder, IMO.
> > attr:
> > MOUNT_SETATTR_DEV (1<<0)
> > MOUNT_SETATTR_NODEV (1<<0)|(1<<1)
> > MOUNT_SETATTR_EXEC (1<<2)
> > MOUNT_SETATTR_NOEXEC (1<<2)|(1<<3)
> > MOUNT_SETATTR_SUID (1<<4)
> > MOUNT_SETATTR_NOSUID (1<<4)|(1<<5)
> > MOUNT_SETATTR_RW (1<<6)
> > MOUNT_SETATTR_RO (1<<6)|(1<<7)
> > MOUNT_SETATTR_RELATIME (1<<8)
> > MOUNT_SETATTR_NOATIME (1<<8)|(1<<9)
> > MOUNT_SETATTR_NODIRATIME (1<<8)|(2<<9)
> > MOUNT_SETATTR_STRICTATIME (1<<8)|(3<<9)
> > MOUNT_SETATTR_PRIVATE (1<<11)
> > MOUNT_SETATTR_SHARED (1<<11)|(1<<12)
> > MOUNT_SETATTR_SLAVE (1<<11)|(2<<12)
> > MOUNT_SETATTR_UNBINDABLE (1<<11)|(3<<12)
> So, I like this generally, some notes though:
> I wonder if this should be two separate parameters, a mask and the settings?
> I'm not sure that's worth it since some of the mask bits would cover multiple
Nah, better put those bits in the same word, as in above. Here bits 0, 2, 4, 6,
8 and 11 tell which attributes are to be modified, with values to set living
in bits 1, 3, 5, 7, 9--10 and 12--13. Look at the constants above..
> Also, should NODIRATIME be separate from the other *ATIME flags? I do also
> like treating some of these settings as enumerations rather than a set of
Huh? That's precisely what I'm doing there: bit 8 is "want to change atime
settings", bits 9 and 10 hold a 4-element enumeration (rel/no/nodir/strict).
Similar for propagation settings (bit 11 indicates that we want to set those,
bits 12 and 13 - 4-element enum)...
> I would make the prototype:
> int mount_setattr(int dirfd, const char *path,
> unsigned int flags, unsigned int attr,
> void *reserved5);
> Further, do we want to say you can either change the propagation type *or*
> reconfigure the mountpoint restrictions, but not both at the same time?
Why? MOUNT_SETATTR_PRIVATE | MOUNT_NOATIME | MOUNT_SUID, i.e. 00101100010000,
i.e. 0xb10 for "turn nosuid off, switch atime polcy to noatime, change propagation
to private, leave everything else as-is"...
And for fsck sake, what's that "void *reserved5" for?
> > With either openat() used as in this series, or explicit
> > int open_tree(int dirfd, const char *path, int flags)
> Maybe open_mount(), grab_mount() or pick_mount()?
> I wonder if fsopen()/fspick() should be create_fs()/open_fs()...
> > returning a descriptor, with flags being
> > AT_EMPTY_PATH, AT_NO_AUTOMOUNT, AT_RECURSIVE, AT_CLONE
> > with AT_RECURSIVE without AT_CLONE being an error.
> You also need an O_CLOEXEC equivalent.
> I would make it:
> OPEN_TREE_CLOEXEC 0x00000001
Why not O_CLOEXEC, as with epoll_create()/timerfd_create()/etc.?
> OPEN_TREE_EMPTY_PATH 0x00000002
> OPEN_TREE_FOLLOW_SYMLINK 0x00000004
> OPEN_TREE_NO_AUTOMOUNT 0x00000008
Why? How are those different from normal AT_EMPTY_PATH/AT_NO_AUTOMOUNT?
> OPEN_TREE_CLONE 0x00000010
> OPEN_TREE_RECURSIVE 0x00000020
> adding the follow-symlinks so that you don't grab a symlink target by
> accident. (Can you actually mount on top of a symlink?)
You can't - mount(2) uses LOOKUP_FOLLOW for mountpoint (well, user_path(),
> > Hell, might even add AT_UMOUNT for "open root and detach, to be dissolved on
> > close", incompatible with AT_CLONE.
> Cute. Guess you could do:
> fd = open_mount(..., OPEN_TREE_DETACH);
> mount_setattr(fd, "",
> MOUNT_SETATTR_NOSUID, NULL);
> move_mount(fd, "", ...);