Re: [RFC PATCH] fs: add AT_EMPTY_PATH support to unlinkat()

From: Al Viro
Date: Sun Oct 08 2023 - 22:06:39 EST


On Fri, Sep 29, 2023 at 03:04:56PM +0100, Luis Henriques wrote:

> -int do_unlinkat(int dfd, struct filename *name)
> +int do_unlinkat(int dfd, struct filename *name, int flags)
> {
> int error;
> - struct dentry *dentry;
> + struct dentry *dentry, *parent;
> struct path path;
> struct qstr last;
> int type;
> struct inode *inode = NULL;
> struct inode *delegated_inode = NULL;
> unsigned int lookup_flags = 0;
> -retry:
> - error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
> - if (error)
> - goto exit1;
> + bool empty_path = (flags & AT_EMPTY_PATH);
>
> - error = -EISDIR;
> - if (type != LAST_NORM)
> - goto exit2;
> +retry:
> + if (empty_path) {
> + error = filename_lookup(dfd, name, 0, &path, NULL);
> + if (error)
> + goto exit1;
> + parent = path.dentry->d_parent;
> + dentry = path.dentry;
> + } else {
> + error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type);
> + if (error)
> + goto exit1;
> + error = -EISDIR;
> + if (type != LAST_NORM)
> + goto exit2;
> + parent = path.dentry;
> + }
>
> error = mnt_want_write(path.mnt);
> if (error)
> goto exit2;
> retry_deleg:
> - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
> - dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
> + inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> + if (!empty_path)
> + dentry = lookup_one_qstr_excl(&last, parent, lookup_flags);

For starters, your 'parent' might have been freed under you, just as you'd
been trying to lock its inode. Or it could have become negative just as you'd
been fetching its ->d_inode, while we are at it.

Races aside, you are changing permissions required for removing files. For
unlink() you need to be able to get to the parent directory; if it's e.g.
outside of your namespace, you can't do anything to it. If file had been
opened there by somebody who could reach it and passed to you (via SCM_RIGHTS,
for example) you currently can't remove the sucker. With this change that
is no longer true.

The same goes for the situation when file is bound into your namespace (or
chroot jail, for that matter). path.dentry might very well be equal to
root of path.mnt; path.dentry->d_parent might be in part of tree that is
no longer visible *anywhere*. rmdir() should not be able to do anything
with it...

IMO it's fundamentally broken; not just implementation, but the concept
itself.

NAKed-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>