Re: [PATCH v5] Add a "nosymfollow" mount option.

From: Ross Zwisler
Date: Thu Feb 06 2020 - 14:10:53 EST


On Tue, Feb 4, 2020 at 8:45 PM Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> On 2020-02-04, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > On Tue, Feb 04, 2020 at 04:49:48PM -0700, Ross Zwisler wrote:
> > > On Tue, Feb 4, 2020 at 3:11 PM Ross Zwisler <zwisler@xxxxxxxxxxxx> wrote:
> > > > On Tue, Feb 4, 2020 at 2:53 PM Raul Rangel <rrangel@xxxxxxxxxx> wrote:
> > > > > > --- a/include/uapi/linux/mount.h
> > > > > > +++ b/include/uapi/linux/mount.h
> > > > > > @@ -34,6 +34,7 @@
> > > > > > #define MS_I_VERSION (1<<23) /* Update inode I_version field */
> > > > > > #define MS_STRICTATIME (1<<24) /* Always perform atime updates */
> > > > > > #define MS_LAZYTIME (1<<25) /* Update the on-disk [acm]times lazily */
> > > > > > +#define MS_NOSYMFOLLOW (1<<26) /* Do not follow symlinks */
> > > > > Doesn't this conflict with MS_SUBMOUNT below?
> > > > > >
> > > > > > /* These sb flags are internal to the kernel */
> > > > > > #define MS_SUBMOUNT (1<<26)
> > > >
> > > > Yep. Thanks for the catch, v6 on its way.
> > >
> > > It actually looks like most of the flags which are internal to the
> > > kernel are actually unused (MS_SUBMOUNT, MS_NOREMOTELOCK, MS_NOSEC,
> > > MS_BORN and MS_ACTIVE). Several are unused completely, and the rest
> > > are just part of the AA_MS_IGNORE_MASK which masks them off in the
> > > apparmor LSM, but I'm pretty sure they couldn't have been set anyway.
> > >
> > > I'll just take over (1<<26) for MS_NOSYMFOLLOW, and remove the rest in
> > > a second patch.
> > >
> > > If someone thinks these flags are actually used by something and I'm
> > > just missing it, please let me know.
> >
> > Afraid you did miss it ...
> >
> > /*
> > * sb->s_flags. Note that these mirror the equivalent MS_* flags where
> > * represented in both.
> > */
> > ...
> > #define SB_SUBMOUNT (1<<26)
> >
> > It's not entirely clear to me why they need to be the same, but I haven't
> > been paying close attention to the separation of superblock and mount
> > flags, so someone else can probably explain the why of it.
>
> I could be wrong, but I believe this is historic and originates from the
> kernel setting certain flags internally (similar to the whole O_* flag,
> "internal" O_* flag, and FMODE_NOTIFY mixup).
>
> Also, one of the arguments for the new mount API was that we'd run out
> of MS_* bits so it's possible that you have to enable this new mount option
> in the new mount API only. (Though Howells is the right person to talk
> to on this point.)

As far as I can tell, SB_SUBMOUNT doesn't actually have any dependence on
MS_SUBMOUNT. Nothing ever sets or checks MS_SUBMOUNT from within the kernel,
and whether or not it's set from userspace has no bearing on how SB_SUBMOUNT
is used. SB_SUBMOUNT is set independently inside of the kernel in
vfs_submount().

I agree that their association seems to be historical, introduced in this
commit from David Howells:

e462ec50cb5fa VFS: Differentiate mount flags (MS_*) from internal superblock flags

In that commit message David notes:

(1) Some MS_* flags get translated to MNT_* flags (such as MS_NODEV ->
MNT_NODEV) without passing this on to the filesystem, but some
filesystems set such flags anyway.

I think this is sort of what we are trying to do with MS_NOSYMFOLLOW: have a
userspace flag that translates to MNT_NOSYMFOLLOW, but which doesn't need an
associated SB_* flag. Is it okay to reclaim the bit currently owned by
MS_SUBMOUNT and use it for MS_NOSYMFOLLOW?
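
To make the idea concrete, here is a minimal standalone sketch of that kind
of translation. It is an illustration only, not the kernel's code: the EX_*
names and the helper ex_ms_to_mnt() are hypothetical, and the values chosen
for EX_MS_NOSYMFOLLOW and EX_MNT_NOSYMFOLLOW are placeholders, since the bit
to use is exactly what is under discussion.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the userspace-visible MS_* flags. */
#define EX_MS_NODEV        (1u << 2)
#define EX_MS_NOSYMFOLLOW  (1u << 26)	/* placeholder: the bit under discussion */

/* Hypothetical stand-ins for the per-mount MNT_* flags. */
#define EX_MNT_NODEV        0x02u
#define EX_MNT_NOSYMFOLLOW  0x80u	/* placeholder value, an assumption */

/*
 * Translate the per-mount subset of the MS_* flags into MNT_* flags and
 * clear them from *ms_flags, so that nothing is left over to be passed
 * down to the filesystem as a superblock flag.
 */
uint32_t ex_ms_to_mnt(uint32_t *ms_flags)
{
	uint32_t mnt_flags = 0;

	if (*ms_flags & EX_MS_NODEV)
		mnt_flags |= EX_MNT_NODEV;
	if (*ms_flags & EX_MS_NOSYMFOLLOW)
		mnt_flags |= EX_MNT_NOSYMFOLLOW;

	*ms_flags &= ~(EX_MS_NODEV | EX_MS_NOSYMFOLLOW);
	return mnt_flags;
}
```

The point of the sketch is that the userspace bit is consumed at translation
time, so it never needs a matching SB_* bit.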

A second option would be to choose one of the unused MS_* values from the
middle of the range, such as 256 (1<<8) or 512 (1<<9). Looking back as far as
git will let me, I don't think these values have been used for MS_* flags
since at least v2.6.12:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/fs.h?id=1da177e4c3f41524e886b7f1b8a0c1fc7321cac2

I think maybe these used to be S_WRITE and S_APPEND, which weren't filesystem
mount flags?

https://sites.uclouvain.be/SystInfo/usr/include/sys/mount.h.html
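
As a rough way to check a candidate bit against the existing flags, something
like the sketch below could be used. The table is only a partial list of the
values I believe are in include/uapi/linux/mount.h, so treat it as an
assumption and check it against your tree. The ex_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_flag {
	const char *name;
	uint32_t bit;
};

/* Partial list, assumed from include/uapi/linux/mount.h. */
static const struct ex_flag ex_known_flags[] = {
	{ "MS_MANDLOCK",    1u << 6 },
	{ "MS_DIRSYNC",     1u << 7 },
	{ "MS_NOATIME",     1u << 10 },
	{ "MS_NODIRATIME",  1u << 11 },
	{ "MS_I_VERSION",   1u << 23 },
	{ "MS_STRICTATIME", 1u << 24 },
	{ "MS_LAZYTIME",    1u << 25 },
	{ "MS_SUBMOUNT",    1u << 26 },
};

/*
 * Return the name of the first listed flag that overlaps the candidate
 * value, or NULL if the candidate is free as far as this table knows.
 */
const char *ex_flag_owner(uint32_t candidate)
{
	size_t n = sizeof(ex_known_flags) / sizeof(ex_known_flags[0]);

	for (size_t i = 0; i < n; i++)
		if (ex_known_flags[i].bit & candidate)
			return ex_known_flags[i].name;
	return NULL;
}
```

With this table, ex_flag_owner(1u << 26) reports MS_SUBMOUNT, which is the
conflict Raul pointed out, while 256 and 512 come back as unowned.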

A third option would be to create this flag using the new mount system:

https://lwn.net/Articles/753473/
https://lwn.net/Articles/759499/

My main concern with this option is that for Chrome OS we'd like to be able to
backport whatever solution we come up with to a variety of older kernels, and
if we go with the new mount system this would require us to backport the
entire new mount system to those kernels, which I think is infeasible.

David, what are your thoughts on this? Of these three options for supporting
a new MS_NOSYMFOLLOW flag:

1) reclaim the bit currently used by MS_SUBMOUNT
2) use a smaller unused value for the flag, 256 or 512
3) implement the new flag only in the new mount system

do you think either #1 or #2 is workable? If so, which would you prefer?