Re: [RFC][PATCH 00/23] VFS: Introduce superblock configuration context [ver #4]

From: David Howells
Date: Tue May 30 2017 - 11:36:21 EST


Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:

> Random notes:
> * "sb_config" looks rather odd in the current variant; mount_context,
> perhaps? Or fs_context, for that matter... Anyway, that's trivial.

You can argue that one with MiklÃs. He argued against mount_context as I had
it originally. His point is that the same struct may be used when
reconfiguring an sb - which isn't exactly a mount operation (even though we do
it that day today with remount).

> * if NFS folks want to play with EXPORT_SYMBOL_GPL, fine, but any
> EXPORT_SYMBOL_GPL in vfs proper is a mistake. If it's an interface that
> makes sense, just export it; if it's a vewwwy, vewwwy pwiwate interface
> for some specific module - let's figure out how to deal with that layering
> violation rather than exporting it at all.

I agree, but apparently not everyone does. There are _GPL symbols in the core
VFS that I need to replace.

> * what the hell is ms_flags thing doing in __vfs_new_sb_config()?
> It's a really vile mix of unrelated flags and operations we had in existing
> mount(2) ABI. With MS_KERNMOUNT thrown into that loo^Wmix. Sure, we need
> to parse the garbage fed to mount(2). And we need to pass that garbage to
> "legacy" types as well, but let's not inflict it upon the new mechanisms.

I know, but we might get it from mount(2). I can tamp down the flag mask and
translate it from MS_*, but the MS_* flags are also stored in the superblock
(->s_flags).

I've removed the MNT_* flags from there already.

> * what's wrong with simple_pin_fs() as it is? You keep
> vfs_kern_mount() anyway, so...

I would like to replace vfs_kern_mount() and vfs_submount(), with the _sc
versions but the users would need converting first. It might make sense to
retain an __init variant of the former though.

> * vfs_new_sb_config(): please, move dealing with name into the caller.
> Then you would be able to use it more than once.

Technically, it's used twice, but okay. I guess I should just rename
__vfs_new_sb_config() to vfs_new_sb_config() and add the extra parameters to
the caller.

> * submount side of that thing: do we ever want a type different from
> that of src_sb,

Hmmm... Good question. For the moment I've assumed not. I've killed off the
NFS special types since I can now carry the information in the sb_config
struct that they previously conveyed.

> and how the fuck would methods know what to do with it?

Until I have an example, it's hard to say.

> * remounts: where (if anywhere) do you call ->validate() for those,

It got moved out of the path that revalidate was invoking. I need to put it
back.

However, it may be worth leaving this to the filesystem to invoke during
->get_tree() and ->remount_fs() as it then has access to the on-disk fs
metadata if a blockdev is being used, against which it may need to do
validation.

The biggest advantage of having a separate call is that the argument
combination can be validated before taking any locks, opening a blockdev or
sending packets on the network.

> and if you do not, WTF is this
> + if (cfg->sc.purpose == SB_CONFIG_FOR_REMOUNT)
> + return 0;
> for? You know, the only place that ever looks at ->purpose...

That being the only place is true at the moment, but may not remain so as more
filesystems are converted.

> * docs need to be brought in sync with code - 'purpose' is called 'mount_type'
> in those, which is especially unpleasant since you do introduce a field called just
> that - NFS-only and in NFS-private part.

Yep.

> * you don't need to register filesystem to use kern_mount()

Hmmm... I'm not sure whether that's actually a problem.

> * locking inode in fsmount(2). What for?

Yeah, I can get rid of that. The superblock-getting bit used to be done after
this point, so the lock was necessary to prevent a race.

> * ->sb_mountpoint(). YALinuxSadoMasochismHook. Not called on normal
> mount(2) pathway. Yuck...

That replaces security_sb_kern_mount(). That should move into
do_new_mount_sc().

> * could you split whitespace parts off? Minor, but...

You mean patch 2? You could just take that one patch and apply it/pass it to
Linus, then I could rebase.

> * I'd like to see ipc/mqueue.c dealt with as well; feels like procfs
> counterpart might have too much open-coded. This would show what might be
> folded into saner helpers...

Okay. Any other file system types you'd like to see done immediately?
cpuset, maybe?

I still have to finish the ext4 conversion too.

David