Re: [RFC][PATCH 3/3] vfs: Lazily remove mounts on unlinked files and directories.

From: Eric W. Biederman
Date: Tue Oct 08 2013 - 17:47:31 EST

Next message: Greg Kroah-Hartman: "Re: [RFC PATCH v2 0/1] FPGA subsystem core"
Previous message: Alex Thorlton: "Re: [PATCHv4 00/10] split page table lock for PMD tables"
In reply to: Miklos Szeredi: "Re: [RFC][PATCH 3/3] vfs: Lazily remove mounts on unlinked files anddirectories."
Next in thread: Eric W. Biederman: "Grrrr fusermount."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Miklos Szeredi <miklos@xxxxxxxxxx> writes:

> On Fri, Oct 04, 2013 at 03:43:56PM -0700, Eric W. Biederman wrote:
>> +/**
>> + * shrink_submounts_and_drop - detach submounts, prune dcache, and drop
>> + *
>> + * All done as a single atomic operation reletaive to d_set_mounted().
>> + *
>> + * @dentry: dentry to detach, prune and drop
>> + */
>> +void shrink_submounts_and_drop(struct dentry *dentry)
>> +{
>> + d_drop(dentry);
>> + detach_submounts(dentry);
>
> And here, between detach_submounts() and shrink_dcache_parent() a new mount can
> be added.

Actually it is not possible to add a new mount betwee detach_submounts
and shrink_dcache_parent d_set_mountpoint will see the unhashed
dentry as an unhashed parent and refuse to add any new mount points.

> It's not accidental that check_submounts_and_drop() did the check and the drop
> together, protected by rename_lock and d_lock.

When I looked the locking of detach_mount prevented me from using the
same technique as check_submounts_and_drop. But happily
d_set_mountpoint now checks for unhashed parents and prevents carnage at
that point.

>> + shrink_dcache_parent(dentry);
>> }
>> -EXPORT_SYMBOL(check_submounts_and_drop);
>> +EXPORT_SYMBOL(shrink_submounts_and_drop);
>>

>> @@ -3988,10 +3983,6 @@ static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry,
>> if (target)
>> mutex_lock(&target->i_mutex);
>>
>> - error = -EBUSY;
>> - if (d_mountpoint(old_dentry) || d_mountpoint(new_dentry))
>> - goto out;
>> -
>
> I know of at least one app that relied at some point on a mountpoint (directory
> or non-directory) not being movable: fusermount uses this to ensure that
> unprivileged userspace didn't try replacing a fuse mount with a symlink to trick
> fusermount into umounting an arbitrary path. The code that relied on this was
> replaced by UMOUNT_NOFOLLOW on kernels where it is supported. But in theory
> there may exist a running binary without UMOUNT_NOFOLLOW and relying on EBUSY.
>
> And there may be other such horrid hacks out there.

That is certainly worth thinking about.

In practice if I rename a parent directory I can rename mount points
today. So I don't know that it was ever fully safe to rely on rename
prevention.

We may be able to keep this rule just for the local mount namespace if
that would help.

If we are going to remove lying to the VFS removing the d_mountpoint
restriction of rename is necessary.

It gets scary to rely on the checks only applying locally, because of
things like mount --bind /some/path /some/other/path, and because of
cases like sysfs and nfs where the mutation path is not through the
local vfs and simply bypasses all of these changes.

So I will think about this case some more but I don't think I can make
this change and remain hack compatile with old kernels.

I hope we don't find any horrible hacks that we absolutely must support
because at best that puts everything back to the drawing board.

But I will go through and read the old fusermount code before I get too
much farther just so I understand what I am potentially breaking.

>> error = -EMLINK;
>> if (max_links && !target && new_dir != old_dir &&
>> new_dir->i_nlink >= max_links)
>> @@ -4006,6 +3997,7 @@ static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry,
>> if (target) {
>> target->i_flags |= S_DEAD;
>> dont_mount(new_dentry);
>> + detach_mounts(new_dentry);
>> }
>> out:
>> if (target)
>> @@ -4031,16 +4023,15 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
>> if (target)
>> mutex_lock(&target->i_mutex);
>>
>> - error = -EBUSY;
>> - if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
>> - goto out;
>> -
>> error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
>> if (error)
>> goto out;
>>
>> - if (target)
>> + if (target) {
>> dont_mount(new_dentry);
>> + detach_mounts(new_dentry);
>> + }
>> + detach_mounts(old_dentry);
>
> Why exactly? "Moved file changes contents" is not the least surprising result,
> IMO. And why the difference between rename-dir and rename-other in
> this regard?

Because detach_mounts(old_dentry) is a bug.

detach_mounts(new_dentry) is needed so that it is safe to put new_dentry
a few lines

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Greg Kroah-Hartman: "Re: [RFC PATCH v2 0/1] FPGA subsystem core"
Previous message: Alex Thorlton: "Re: [PATCHv4 00/10] split page table lock for PMD tables"
In reply to: Miklos Szeredi: "Re: [RFC][PATCH 3/3] vfs: Lazily remove mounts on unlinked files anddirectories."
Next in thread: Eric W. Biederman: "Grrrr fusermount."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]