[RFC PATCH 12/15] epoll: support polling from userspace for ep_remove()

From: Roman Penyaev
Date: Wed Jan 09 2019 - 11:41:08 EST


When epfd is polled from userspace and an item is being removed:

1. Mark the user item as freed. If userspace has not yet consumed the
ready event, route all events to kernel lists.
2. If a shrink is required, route all events to kernel lists.
3. On unregistration of epoll entries, do not forget to flush the item's
work, which may have just been queued from ep_poll_callback().

Signed-off-by: Roman Penyaev <rpenyaev@xxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Davidlohr Bueso <dbueso@xxxxxxx>
Cc: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrea Parri <andrea.parri@xxxxxxxxxxxxxxxxxxxx>
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
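Note for reviewers: steps 1 and 2 of the changelog can be modelled by the
small userspace sketch below. It is only an illustration of the decision
made in ep_remove(), not the kernel code. Every toy_* name and field is
hypothetical, and folding the "shrink is required" check into the free
helper is an assumption, since ep_free_user_item() and
ep_route_events_to_klists() are defined elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these do not match the real kernel structures. */
struct toy_item {
	bool freed;          /* user-visible item has been marked as freed */
	bool ready_pending;  /* ready event published, not yet consumed */
};

struct toy_ep {
	bool polled_by_user;    /* models ep_polled_by_user(ep) */
	bool shrink_required;   /* assumption: a shrink request is visible here */
	bool events_to_klists;  /* once set, events are routed to kernel lists */
};

/*
 * Mark the item as freed and report whether the caller has to reroute
 * events to kernel lists (changelog steps 1 and 2).
 */
static bool toy_free_user_item(struct toy_ep *ep, struct toy_item *it)
{
	it->freed = true;
	return it->ready_pending || ep->shrink_required;
}

/* Models only the ordering this patch adds to ep_remove(). */
static void toy_remove(struct toy_ep *ep, struct toy_item *it)
{
	bool to_klists = false;

	assert(ep != NULL && it != NULL);

	if (ep->polled_by_user)
		to_klists = toy_free_user_item(ep, it);

	/* In the patch this part runs under write_lock_irq(&ep->lock). */
	if (to_klists)
		ep->events_to_klists = true;
}
```

In this model an item whose ready event was already consumed, with no
shrink pending, is simply marked as freed and the routing of events is
left untouched.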
fs/eventpoll.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 2af849e6c7a5..7732a8029a1c 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -780,6 +780,14 @@ static void ep_unregister_pollwait(struct eventpoll *ep, struct epitem *epi)
ep_remove_wait_queue(pwq);
kmem_cache_free(pwq_cache, pwq);
}
+ if (ep_polled_by_user(ep)) {
+ /*
+ * Events polled by userspace are offloaded to a work item,
+ * so we have to be sure that everything which was queued
+ * has run to completion.
+ */
+ flush_work(&epi->work);
+ }
}

/* call only when ep->mtx is held */
@@ -1168,6 +1176,7 @@ static bool ep_add_event_to_uring(struct epitem *epi, __poll_t pollflags)
static int ep_remove(struct eventpoll *ep, struct epitem *epi)
{
struct file *file = epi->ffd.file;
+ bool events_to_klists = false;

lockdep_assert_irqs_enabled();

@@ -1183,9 +1192,14 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)

rb_erase_cached(&epi->rbn, &ep->rbr);

+ if (ep_polled_by_user(ep))
+ events_to_klists = ep_free_user_item(epi);
+
write_lock_irq(&ep->lock);
if (ep_is_linked(epi))
list_del_init(&epi->rdllink);
+ if (events_to_klists)
+ ep_route_events_to_klists(ep);
write_unlock_irq(&ep->lock);

wakeup_source_unregister(ep_wakeup_source(epi));
--
2.19.1