Re: [RFC PATCH] vfs: shutdown lease notifications on file close
From: Jeff Layton
Date: Fri Oct 13 2017 - 14:31:02 EST
On Fri, 2017-10-13 at 08:56 -0700, Dan Williams wrote:
> While implementing MAP_DIRECT, an mmap flag that arranges for an
> FL_LAYOUT lease to be established, Al noted:
>
> You are not even guaranteed that the descriptor will still be
> open by the time you pass it down to your helper, never mind the
> moment when the event actually happens...
>
> The first problem can be solved with an fd{get,put} at mmap
> {entry,exit}. The second problem appears to be a general issue.
>
> Leases follow the lifetime of the inode, so it is possible for a lease
> to be broken after the file is closed. When that happens, userspace may
> get a notification on a stale fd. Of course it is not recommended that a
> process close a file descriptor with an active lease, but if it does, we
> should assume that the notification is no longer needed. Walk the leases
> at close time and invalidate any pending fasync instances.
>
> Cc: Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
> Cc: "J. Bruce Fields" <bfields@xxxxxxxxxxxx>
> Reported-by: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> ---
> fs/fcntl.c | 24 +++++++++++++++++++++++-
> fs/file.c | 1 +
> fs/locks.c | 22 ++++++++++++++++++++++
> include/linux/fs.h | 7 +++++++
> 4 files changed, 53 insertions(+), 1 deletion(-)
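
For context, the userspace pattern at issue looks roughly like this
(illustrative sketch, not from the patch; the path and signal choice
are made up):

    #define _GNU_SOURCE     /* F_SETSIG, F_SETLEASE */
    #include <fcntl.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void on_break(int sig, siginfo_t *info, void *ctx)
    {
            /* info->si_fd names the descriptor the lease was set on;
             * after close() that number can be stale or recycled. */
    }

    int main(void)
    {
            struct sigaction sa = {
                    .sa_sigaction = on_break,
                    .sa_flags = SA_SIGINFO,
            };
            int fd = open("/tmp/leased-file", O_RDONLY);

            if (fd < 0)
                    exit(1);
            sigaction(SIGRTMIN, &sa, NULL);
            fcntl(fd, F_SETSIG, SIGRTMIN);  /* deliver si_fd in siginfo */
            fcntl(fd, F_SETLEASE, F_RDLCK); /* creates the fasync entry */

            close(fd);      /* the fasync entry can outlive this today */
            pause();
            return 0;
    }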
>
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index 448a1119f0be..03612c363b90 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -974,6 +974,28 @@ int fasync_helper(int fd, struct file * filp, int on, struct fasync_struct **fap
>
> EXPORT_SYMBOL(fasync_helper);
>
> +static void __fasync_silence(int fd, struct fasync_struct *fa)
> +{
> + while (fa) {
> + unsigned long flags;
> +
> + spin_lock_irqsave(&fa->fa_lock, flags);
> + if (fa->fa_file && fa->fa_fd == fd)
> + fa->fa_fd = -1;
> + spin_unlock_irqrestore(&fa->fa_lock, flags);
> + fa = rcu_dereference(fa->fa_next);
> + }
> +}
> +
> +void fasync_silence(int fd, struct fasync_struct **fp)
> +{
> + if (*fp) {
> + rcu_read_lock();
> + __fasync_silence(fd, *fp);
> + rcu_read_unlock();
> + }
> +}
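
nit: kill_fasync() checks *fp outside the rcu read section like this,
but it also takes care to rcu_dereference() the head pointer before
handing it off to the walker. Probably best to mirror that here, i.e.
something like (untested):

    void fasync_silence(int fd, struct fasync_struct **fp)
    {
            if (*fp) {
                    rcu_read_lock();
                    __fasync_silence(fd, rcu_dereference(*fp));
                    rcu_read_unlock();
            }
    }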
> +
> /*
> * rcu_read_lock() is held
> */
> @@ -989,7 +1011,7 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
> return;
> }
> spin_lock_irqsave(&fa->fa_lock, flags);
> - if (fa->fa_file) {
> + if (fa->fa_file && fa->fa_fd >= 0) {
> fown = &fa->fa_file->f_owner;
> /* Don't send SIGURG to processes which have not set a
> queued signum: SIGURG has its own default signalling
> diff --git a/fs/file.c b/fs/file.c
> index 1fc7fbbb4510..b90969bf1f94 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -633,6 +633,7 @@ int __close_fd(struct files_struct *files, unsigned fd)
> rcu_assign_pointer(fdt->fd[fd], NULL);
> __clear_close_on_exec(fd, fdt);
> __put_unused_fd(files, fd);
> + locks_silence_lease(fd, file);
> spin_unlock(&files->file_lock);
> return filp_close(file, files);
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 1bd71c4d663a..ca93e4dbdd90 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2573,6 +2573,28 @@ void locks_remove_file(struct file *filp)
> spin_unlock(&ctx->flc_lock);
> }
>
> +/*
> + * The fd is assumed to be valid, i.e. this routine is called in the
> + * filp_close() path where the state of the fd is known.
> + */
> +void locks_silence_lease(int fd, struct file *filp)
> +{
> + struct file_lock_context *ctx;
> + struct file_lock *fl;
> +
> + ctx = smp_load_acquire(&locks_inode(filp)->i_flctx);
> + if (!ctx || list_empty_careful(&ctx->flc_lease))
> + return;
> +
> + spin_lock(&ctx->flc_lock);
> + list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
> + if (fl->fl_pid != current->tgid)
> + continue;
> + fasync_silence(fd, &fl->fl_fasync);
> + }
> + spin_unlock(&ctx->flc_lock);
> +}
> +
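
One thing I notice here (if I'm reading lease_setup() right): the
fasync entry is keyed to the descriptor that was passed in at
F_SETLEASE time, not to the struct file. So with this change, a holder
that dup()s the descriptor and closes the original keeps the lease but
silently loses the notification. Is that intended? Roughly (untested
fragment; signal setup as in the earlier sketch):

    int fd = open("/tmp/leased-file", O_RDONLY);

    fcntl(fd, F_SETSIG, SIGRTMIN);
    fcntl(fd, F_SETLEASE, F_RDLCK); /* fasync entry records "fd" */

    int fd2 = dup(fd);      /* same struct file, same lease */
    close(fd);              /* with this patch: entry gets fa_fd = -1 */

    /* fd2 still holds the file (and the lease) open, but a break
     * no longer raises a signal */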
> /**
> * posix_unblock_lock - stop waiting for a file lock
> * @waiter: the lock which was waiting
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 17e0e899e184..019853a7b2cd 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1076,6 +1076,7 @@ extern void locks_copy_lock(struct file_lock *, struct file_lock *);
> extern void locks_copy_conflock(struct file_lock *, struct file_lock *);
> extern void locks_remove_posix(struct file *, fl_owner_t);
> extern void locks_remove_file(struct file *);
> +extern void locks_silence_lease(int, struct file *);
> extern void locks_release_private(struct file_lock *);
> extern void posix_test_lock(struct file *, struct file_lock *);
> extern int posix_lock_file(struct file *, struct file_lock *, struct file_lock *);
> @@ -1153,6 +1154,11 @@ static inline void locks_remove_posix(struct file *filp, fl_owner_t owner)
> return;
> }
>
> +static inline void locks_silence_lease(int fd, struct file *filp)
> +{
> + return;
> +}
> +
> static inline void locks_remove_file(struct file *filp)
> {
> return;
> @@ -1260,6 +1266,7 @@ extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasy
> extern int fasync_remove_entry(struct file *, struct fasync_struct **);
> extern struct fasync_struct *fasync_alloc(void);
> extern void fasync_free(struct fasync_struct *);
> +extern void fasync_silence(int, struct fasync_struct **);
>
> /* can be called from interrupts */
> extern void kill_fasync(struct fasync_struct **, int, int);
All remaining file leases associated with a particular struct file should
be released at the last fput, via locks_remove_lease.
So yes, you might get a SIGIO after you've already closed the file if
there are lingering filp references out there. You might even get the
signal after you've already recycled the fd. That's potentially a real
problem, IMO (and I guess it's the one you're wanting to address).
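To make the recycling case concrete, it's something like this without
your patch (untested fragment; signal setup as in the sketch near the
top, paths made up):

    int fd = open("/tmp/leased-file", O_RDONLY);

    fcntl(fd, F_SETSIG, SIGRTMIN);
    fcntl(fd, F_SETLEASE, F_RDLCK);

    if (fork() == 0)        /* child inherits fd: a lingering filp ref */
            pause();

    close(fd);              /* lease survives via the child's reference */
    int other = open("/tmp/unrelated", O_RDONLY); /* may reuse fd's number */

    /* a lease break still signals this process, and the handler's
     * si_fd can now name "other" instead */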
FL_LAYOUT leases are not exposed to userland right now, so I think we
can change the semantics there, as long as it doesn't break knfsd's use
of them (and I wouldn't think that it would). FL_LEASE is more
debatable, just because there are userland callers out there (as Al
points out).
OK, I just saw the note about waiting until you talk to the RDMA folks.
Let us know if you want us to look at this more closely.
--
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>