Re: [PATCH] epoll: avoid calling ep_call_nested() from ep_poll_safewake()

From: Davidlohr Bueso
Date: Tue Oct 17 2017 - 11:38:40 EST


On Fri, 13 Oct 2017, Jason Baron wrote:

The ep_poll_safewake() function is used to wakeup potentially nested epoll
file descriptors. The function uses ep_call_nested() to prevent entering
the same wake up queue more than once, and to prevent excessively deep
wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary
since we are already preventing these conditions during EPOLL_CTL_ADD. This
saves extra function calls, and avoids taking a global lock during the
ep_call_nested() calls.

This makes sense.


I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC
case, since ep_call_nested() keeps track of the nesting level, and this is
required by the call to spin_lock_irqsave_nested(). It would be nice to
remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as
well, however its not clear how to simply pass the nesting level through
multiple wake_up() levels without more surgery. In any case, I don't think
CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch, also
apparently fixes a workload at Google that Salman Qazi reported by
completely removing the poll_safewake_ncalls->lock from wakeup paths.

I'm a bit curious about the workload (which uses lots of EPOLL_CTL_ADDs) as
I was tackling the nested epoll scaling issue with loop and readywalk lists
in mind.

Signed-off-by: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Salman Qazi <sqazi@xxxxxxxxxx>

Acked-by: Davidlohr Bueso <dbueso@xxxxxxx>

---
fs/eventpoll.c | 47 ++++++++++++++++++-----------------------------
1 file changed, 18 insertions(+), 29 deletions(-)

Yay for getting rid of some of the callback hell.

Thanks,
Davidlohr