[PATCH 3.16 181/294] epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove()

From: Ben Hutchings
Date: Mon Nov 06 2017 - 19:05:15 EST


3.16.50-rc1 review patch. If anyone has any objections, please let me know.

------------------

From: Oleg Nesterov <oleg@xxxxxxxxxx>

commit 138e4ad67afd5c6c318b056b4d17c17f2c0ca5c0 upstream.

The race was introduced by me in commit 971316f0503a ("epoll:
ep_unregister_pollwait() can use the freed pwq->whead"). I did not
realize that nothing can protect eventpoll after ep_poll_callback() sets
->whead = NULL, only whead->lock can save us from the race with
ep_free() or ep_remove().

Move ->whead = NULL to the end of ep_poll_callback() and add the
necessary barriers.

TODO: cleanup the ewake/EPOLLEXCLUSIVE logic, it was confusing even
before this patch.

Hopefully this explains use-after-free reported by syzcaller:

BUG: KASAN: use-after-free in debug_spin_lock_before
...
_raw_spin_lock_irqsave+0x4a/0x60 kernel/locking/spinlock.c:159
ep_poll_callback+0x29f/0xff0 fs/eventpoll.c:1148

this is spin_lock(eventpoll->lock),

...
Freed by task 17774:
...
kfree+0xe8/0x2c0 mm/slub.c:3883
ep_free+0x22c/0x2a0 fs/eventpoll.c:865

Fixes: 971316f0503a ("epoll: ep_unregister_pollwait() can use the freed pwq->whead")
Reported-by: èéé <long7573@xxxxxxx>
Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
[bwh: Backported to 3.16: EPOLLEXCLUSIVE is not supported]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -518,8 +518,13 @@ static void ep_remove_wait_queue(struct
wait_queue_head_t *whead;

rcu_read_lock();
- /* If it is cleared by POLLFREE, it should be rcu-safe */
- whead = rcu_dereference(pwq->whead);
+ /*
+ * If it is cleared by POLLFREE, it should be rcu-safe.
+ * If we read NULL we need a barrier paired with
+ * smp_store_release() in ep_poll_callback(), otherwise
+ * we rely on whead->lock.
+ */
+ whead = smp_load_acquire(&pwq->whead);
if (whead)
remove_wait_queue(whead, &pwq->wait);
rcu_read_unlock();
@@ -1006,17 +1011,6 @@ static int ep_poll_callback(wait_queue_t
struct epitem *epi = ep_item_from_wait(wait);
struct eventpoll *ep = epi->ep;

- if ((unsigned long)key & POLLFREE) {
- ep_pwq_from_wait(wait)->whead = NULL;
- /*
- * whead = NULL above can race with ep_remove_wait_queue()
- * which can do another remove_wait_queue() after us, so we
- * can't use __remove_wait_queue(). whead->lock is held by
- * the caller.
- */
- list_del_init(&wait->task_list);
- }
-
spin_lock_irqsave(&ep->lock, flags);

/*
@@ -1081,6 +1075,22 @@ out_unlock:
if (pwake)
ep_poll_safewake(&ep->poll_wait);

+ if ((unsigned long)key & POLLFREE) {
+ /*
+ * If we race with ep_remove_wait_queue() it can miss
+ * ->whead = NULL and do another remove_wait_queue() after
+ * us, so we can't use __remove_wait_queue().
+ */
+ list_del_init(&wait->task_list);
+ /*
+ * ->whead != NULL protects us from the race with ep_free()
+ * or ep_remove(), ep_remove_wait_queue() takes whead->lock
+ * held by the caller. Once we nullify it, nothing protects
+ * ep/epi or even wait.
+ */
+ smp_store_release(&ep_pwq_from_wait(wait)->whead, NULL);
+ }
+
return 1;
}