Re: [PATCH] fanotify: use killable wait for waiting response for permission events

From: Jan Kara
Date: Mon Aug 20 2018 - 06:53:32 EST


Hi!

On Mon 20-08-18 10:09:42, Konstantin Khlebnikov wrote:
> Waiting in uninterruptible state for response from userspace
> easily produces deadlocks and hordes of unkillable tasks.
>
> This patch makes this wait killable.
>
> At receiving fatal signal task will remove queued event and die.
> If event is already handled then response will be received as usual.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>

Thanks for the patch. I like the idea. Some comments inline.

> ---
> fs/notify/fanotify/fanotify.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index eb4e75175cfb..7a0c37790c89 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -64,7 +64,27 @@ static int fanotify_get_response(struct fsnotify_group *group,
>
> 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
>
> -	wait_event(group->fanotify_data.access_waitq, event->response);
> +	ret = wait_event_killable(group->fanotify_data.access_waitq,
> +				  event->response);
> +	if (ret) {
> +		/* Try to remove pending event from the queue */
> +		spin_lock(&group->notification_lock);
> +		if (!list_empty(&event->fae.fse.list))
> +			list_del_init(&event->fae.fse.list);

Here you forget to decrement group->q_len, as
fsnotify_remove_first_event() does. Otherwise q_len no longer matches the
number of events actually on the queue.
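
To illustrate the accounting, here is a minimal userspace sketch, not kernel
code. All model_* names are made up for this example, the list is a plain
hand-rolled doubly linked list, and the notification_lock is left out
entirely. It only shows that removing a queued event and decrementing the
counter have to happen together:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: the model_* names are hypothetical, not kernel API. */
struct model_list {
	struct model_list *next, *prev;
};

struct model_group {
	struct model_list notification_list;	/* queued events */
	unsigned int q_len;			/* number of queued events */
};

struct model_event {
	struct model_list list;
};

void model_list_init(struct model_list *h)
{
	h->next = h;
	h->prev = h;
}

bool model_list_empty(const struct model_list *h)
{
	return h->next == h;
}

void model_group_init(struct model_group *g)
{
	model_list_init(&g->notification_list);
	g->q_len = 0;
}

/* Queue an event at the tail and account for it in q_len. */
void model_queue_event(struct model_group *g, struct model_event *ev)
{
	struct model_list *head = &g->notification_list;

	ev->list.prev = head->prev;
	ev->list.next = head;
	head->prev->next = &ev->list;
	head->prev = &ev->list;
	g->q_len++;
}

/*
 * Remove a still-queued event. Returns false if the event was already
 * taken off the queue by someone else.
 */
bool model_remove_queued_event(struct model_group *g, struct model_event *ev)
{
	if (model_list_empty(&ev->list))
		return false;
	ev->list.prev->next = ev->list.next;
	ev->list.next->prev = ev->list.prev;
	model_list_init(&ev->list);
	assert(g->q_len > 0);
	g->q_len--;	/* the step missing from the hunk above */
	return true;
}
```

In the real code the equivalent would presumably be a group->q_len-- next to
the list_del_init(), under the same notification_lock.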

> +		else
> +			ret = 0;
> +		spin_unlock(&group->notification_lock);

So the list_empty() check above can hit in two cases: either the response
is just being processed (and then we'll be woken up very soon), or the
event is just being copied from the event queue to userspace (in which case
we are in the same situation as in the old code). It would be weird if, in
rare cases, the wait were not really killable. I think we could detect this
situation in fanotify_read() before adding the event to access_list, wake
up the waiter in fanotify_get_response() again, and avoid reporting the
event to userspace. Hmm?
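
One possible reading of that idea, as a single-threaded userspace sketch
rather than kernel code. Every name below (race_state, race_event,
waiter_killed and so on) is hypothetical and none of this exists in
fanotify. It assumes the killed waiter leaves a flag behind when the event
is no longer on the queue, and that the reader checks the flag before
putting the event on access_list:

```c
#include <stdbool.h>

/* Userspace model only: every name here is hypothetical. */
enum race_state {
	RACE_QUEUED,		/* still on the notification queue */
	RACE_COPYING,		/* reader dequeued it, copy in progress */
	RACE_REPORTED,		/* on access_list, awaiting a response */
	RACE_CANCELED,		/* waiter was killed, event dropped */
};

struct race_event {
	enum race_state state;
	bool waiter_killed;	/* fatal signal arrived after dequeue */
	bool waiter_woken;	/* reader woke the waiter up again */
};

/* Reader side: take the event off the queue and start the copy. */
void race_reader_dequeue(struct race_event *ev)
{
	ev->state = RACE_COPYING;
}

/*
 * Waiter side, on a fatal signal. Returns true if the waiter removed the
 * event itself and may return right away, false if the reader already
 * owns the event and the waiter has to wait for another wakeup.
 */
bool race_waiter_killed(struct race_event *ev)
{
	if (ev->state == RACE_QUEUED) {
		ev->state = RACE_CANCELED;
		return true;
	}
	ev->waiter_killed = true;
	return false;
}

/*
 * Reader side, just before adding the event to access_list. Returns true
 * if the event is reported, false if it is dropped and the waiter is
 * woken instead.
 */
bool race_reader_finish(struct race_event *ev)
{
	if (ev->waiter_killed) {
		ev->state = RACE_CANCELED;
		ev->waiter_woken = true;
		return false;
	}
	ev->state = RACE_REPORTED;
	return true;
}
```

The model ignores locking and the actual copy to userspace. It only shows
the ordering: a kill that lands during the copy is noticed by the reader,
so the waiter is woken without waiting for a userspace response.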

Honza

> +
> +		if (ret)
> +			return ret;
> +
> +		/*
> +		 * We cannot return, this will destroy event while
> +		 * process_access_response() fills response.
> +		 * Just wait for wakeup and continue normal flow.
> +		 */
> +		wait_event(group->fanotify_data.access_waitq, event->response);
> +	}
>
> 	/* userspace responded, convert to something usable */
> 	switch (event->response & ~FAN_AUDIT) {
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR