Re: [PATCH v2 2/3] epoll: ep->wq can be woken up unlocked in certain cases

From: Roman Penyaev
Date: Mon Feb 10 2020 - 14:31:40 EST


On 2020-02-10 19:16, Jason Baron wrote:
On 2/10/20 4:41 AM, Roman Penyaev wrote:
Now ep->lock is responsible for wqueue serialization, thus if ep->lock
is taken on write path, wake_up_locked() can be invoked.

Though, read path is different. Since concurrent cpus can enter the
wake up function it needs to be internally serialized, thus wake_up()
variant is used which implies internal spin lock.

Signed-off-by: Roman Penyaev <rpenyaev@xxxxxxx>
Cc: Max Neunhoeffer <max@xxxxxxxxxxxx>
Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
Cc: Christopher Kohlhoff <chris.kohlhoff@xxxxxxxxxxxx>
Cc: Davidlohr Bueso <dbueso@xxxxxxx>
Cc: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
Nothing interesting in v2:
changed the comment a bit

fs/eventpoll.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index eee3c92a9ebf..6e218234bd4a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1173,7 +1173,7 @@ static inline bool chain_epi_lockless(struct epitem *epi)
* Another thing worth to mention is that ep_poll_callback() can be called
* concurrently for the same @epi from different CPUs if poll table was inited
* with several wait queues entries. Plural wakeup from different CPUs of a
- * single wait queue is serialized by wq.lock, but the case when multiple wait
+ * single wait queue is serialized by ep->lock, but the case when multiple wait
* queues are used should be detected accordingly. This is detected using
* cmpxchg() operation.
*/
@@ -1248,6 +1248,12 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
break;
}
}
+ /*
+ * Since here we have the read lock (ep->lock) taken, plural
+ * wakeup from different CPUs can occur, thus we call wake_up()
+ * variant which implies its own lock on wqueue. All other paths
+ * take write lock.
+ */
wake_up(&ep->wq);
}
if (waitqueue_active(&ep->poll_wait))
@@ -1551,7 +1557,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,

/* Notify waiting tasks that events are available */
if (waitqueue_active(&ep->wq))
- wake_up(&ep->wq);
+ wake_up_locked(&ep->wq);


So I think this will now hit the 'lockdep_assert_held()' in
__wake_up_common()? I agree that its correct, but I think it will
confuse lockdep here...

Argh! True. And I do not see any neat way to shut up lockdep here
(Calling lock_acquire() manually seems not an option for such minor
thing).

Then this optimization is not needed, patch is cancelled.

Thanks for noting that.

--
Roman