[PATCH v2 0/4] use rwlock in order to reduce ep_poll_callback() contention

From: Roman Penyaev
Date: Thu Jan 03 2019 - 10:01:15 EST

The last patch targets the contention problem in ep_poll_callback(), which
is easily reproduced by generating events (writes to a pipe or eventfd)
from many threads while a consumer thread polls.

The following are some microbenchmark results based on the test [1], which
starts threads that each generate N events. The test ends when all events
have been fetched by the poller thread:

spinlock

 threads  events/ms  run-time ms
       8       6402        12495
      16       7045        22709
      32       7395        43268
rwlock + xchg

 threads  events/ms  run-time ms
       8      10038         7969
      16      12178        13138
      32      13223        24199

According to these results, the throughput of delivered events increases
by roughly 57% to 79% (8 to 32 threads), and the run time drops by roughly
36% to 44% accordingly.
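
For readers who do not want to open [1], the shape of such a test can be
sketched roughly as below. This is a simplified illustration, not the
actual stress-epoll.c: the names run_stress(), producer_fn() and
MAX_PRODUCERS are made up for the example, every producer is assumed to own
one eventfd, and it does no timing. Note that it sums eventfd counter
values, so several writes may be fetched by a single read.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_PRODUCERS 64	/* illustrative limit, not taken from [1] */

/* Hypothetical per-producer state. */
struct producer {
	pthread_t thread;
	int efd;		/* eventfd this producer signals */
	uint64_t nevents;	/* how many events to generate */
};

/* Each write() adds 1 to the eventfd counter and wakes the epoll side. */
static void *producer_fn(void *arg)
{
	struct producer *p = arg;
	const uint64_t one = 1;

	for (uint64_t i = 0; i < p->nevents; i++)
		if (write(p->efd, &one, sizeof(one)) != sizeof(one))
			break;
	return NULL;
}

/* Start @nthreads producers and poll until every event has been fetched. */
static uint64_t run_stress(int nthreads, uint64_t nevents)
{
	struct producer prod[MAX_PRODUCERS];
	struct epoll_event ready[MAX_PRODUCERS];
	uint64_t total = 0, val;
	int epfd, i, n;

	assert(nthreads >= 1 && nthreads <= MAX_PRODUCERS);
	epfd = epoll_create1(0);
	assert(epfd >= 0);

	for (i = 0; i < nthreads; i++) {
		struct epoll_event ev = { .events = EPOLLIN };

		prod[i].nevents = nevents;
		prod[i].efd = eventfd(0, EFD_NONBLOCK);
		assert(prod[i].efd >= 0);
		ev.data.fd = prod[i].efd;
		n = epoll_ctl(epfd, EPOLL_CTL_ADD, prod[i].efd, &ev);
		assert(n == 0);
	}
	for (i = 0; i < nthreads; i++) {
		n = pthread_create(&prod[i].thread, NULL, producer_fn, &prod[i]);
		assert(n == 0);
	}

	/* Single poller: drain ready eventfds until all events are seen. */
	while (total < (uint64_t)nthreads * nevents) {
		n = epoll_wait(epfd, ready, MAX_PRODUCERS, -1);
		if (n < 0)
			continue;	/* e.g. EINTR; a real test would check errno */
		for (i = 0; i < n; i++)
			if (read(ready[i].data.fd, &val, sizeof(val)) == sizeof(val))
				total += val;	/* read() returns and resets the counter */
	}

	for (i = 0; i < nthreads; i++) {
		pthread_join(prod[i].thread, NULL);
		close(prod[i].efd);
	}
	close(epfd);
	return total;
}
```

Calling run_stress(8, N) from a main() and timing it gives a number in the
spirit of the tables above, although the absolute values will not be
comparable with those measured by [1].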

This series is based on linux-next/akpm.

Changes since v1:

o I was wrong in saying that ep_poll_callback() can't be called
  concurrently for the same epi: several wait queues can be attached
  to a single epoll item, so several event sources can signal in
  parallel. To cover this case, the lockless element addition has to
  detect whether the same @epi is already in the list. This is done
  with an extra cmpxchg() operation.

o Unify waking of the wakeup source: ep_pm_stay_awake_rcu(epi) is now
  called in all cases on the ep_poll_callback() path.

o More explicit comments.
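
To illustrate the first item, the duplicate detection can be sketched in
userspace with C11 atomics as below. This is only a sketch of the idea and
not the code from the patch: struct node, node_init() and
add_tail_lockless() are hypothetical names, and the sketch assumes that
nothing removes elements from the list while adders are running.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a doubly linked list element. */
struct node {
	_Atomic(struct node *) next;
	_Atomic(struct node *) prev;
};

/* An unlinked node (and an empty list head) points to itself. */
static void node_init(struct node *n)
{
	atomic_init(&n->next, n);
	atomic_init(&n->prev, n);
}

/*
 * Append @new to the tail of the list at @head without taking a lock.
 * Returns false if @new has already been claimed by another adder.
 */
static bool add_tail_lockless(struct node *new, struct node *head)
{
	struct node *expected = new;
	struct node *prev;

	assert(new != head);

	/*
	 * Logically this is 'new->next = head', but it is done as a
	 * compare-and-swap: only the caller that still observes
	 * new->next == new wins; everyone else sees the node as taken.
	 */
	if (!atomic_compare_exchange_strong(&new->next, &expected, head))
		return false;

	/* Atomically become the new tail, then link behind the old tail. */
	prev = atomic_exchange(&head->prev, new);
	atomic_store(&prev->next, new);
	atomic_store(&new->prev, prev);
	return true;
}
```

The compare-and-swap is what rejects a second addition of the same element,
and the exchange on the tail pointer is what lets different elements be
appended in parallel. Between the exchange and the following stores the
forward links are briefly incomplete, so a reader must not walk the list
concurrently with adders.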

[1] https://github.com/rouming/test-tools/blob/master/stress-epoll.c

Roman Penyaev (4):
  epoll: make sure all elements in ready list are in FIFO order
  epoll: loosen irq safety in ep_poll_callback()
  epoll: unify awaking of wakeup source on ep_poll_callback() path
  epoll: use rwlock in order to reduce ep_poll_callback() contention

fs/eventpoll.c | 178 ++++++++++++++++++++++++++++++++++++-------------
1 file changed, 133 insertions(+), 45 deletions(-)

Signed-off-by: Roman Penyaev <rpenyaev@xxxxxxx>
Cc: Davidlohr Bueso <dbueso@xxxxxxx>
Cc: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx