Re: [PATCH 00/32] VFS: Introduce filesystem context [ver #8]

From: Eric W. Biederman
Date: Fri Jun 15 2018 - 00:19:16 EST

David Howells <dhowells@xxxxxxxxxx> writes:

> Here are a set of patches to create a filesystem context prior to setting
> up a new mount, populating it with the parsed options/binary data, creating
> the superblock and then effecting the mount. This is also used for remount
> since much of the parsing stuff is common in many filesystems.

I have read through these patches and I noticed a significant issue.

Today in mount_bdev we do something that looks like:

s = sget(..., bdev);
if (s->s_root) {
/* Noop */
} else {
err = fill_super(s, ...);
if (err) {
return ERR_PTR(err);
s->s_flags |= SB_ATTIVE;
bdev->bd_super = s;
return dget(s->s_root);

The key point is that we don't process the mount options at all if
a super block already exists in the kernel. Similar to what
your fscontext changes are doing (after parsing the options).

Your fscontext changes do not improve upon this area of the mount api at
all and that concerns me. This is an area where people can and already
do shoot themselves in their feet.

The real world security issue we had in with this involved devpts. The
devpts filesystem requires the mode and gid parameters for new ttys to
be specified to be posix compliant. People were setting up chroot
environments and mounting devpts with the wrong arguments. As these two
devpts mounts shared a super block a change of arguments on one was a
change of arguments on the other. Which mean the chroots were
periodically breaking the primary devpts and causing new terminals to be
opened with essentially unusable permissions. Fun when you are trying
to ssh in to a box.

Creating a new mount and finding an old mount are the same operation in
the kernel today. This is fundamentally confusing. In the new api
could we please separate these two operations?

Perhaps someting like:
x create
x find

With the "x create" case failing if the filesystem already exists,
still allowing "x find"? And with the "x find" case failing if
the superblock is not already created in the kernel.

That should make it clear to a userspace program what is going on
and give it a chance to mount a filesystem anyway.

In a similar vein could we please clarify the rules for changing mount
options for an existing superblock are in the new api?

Today mount assumes that it has to provide all of the existing options
to reconfigure a mount. What people want to do and what most
filesystems support is just specifying the options that need to be
changed. Can we please make this the rule of how this are expected
to work for fscontext? That only changing mount options need to
be specified before: "x reconfigure"