[PATCH] fs/epoll: rework safewake for CONFIG_DEBUG_LOCK_ALLOC

From: Davidlohr Bueso
Date: Mon Jan 06 2020 - 14:45:50 EST

Some comments on:

f6520c52084 (epoll: simplify ep_poll_safewake() for CONFIG_DEBUG_LOCK_ALLOC)

For one this does not play nice with preempt_rt as disabling irq and
then taking a spinlock_t is a no no; the critical region becomes
preemptible. This is particularly important as -rt is being mainlined.

Secondly, it is really ugly compared to what we had before - albeit not
having to deal with all the ep_call_nested() checks, but I doubt this
overhead matters at all with CONFIG_DEBUG_LOCK_ALLOC.

While the current logic avoids nesting by disabling irq during the whole
path, this seems like an overkill under debug. This patch proposes using
this_cpu_inc_return() then taking the irq-safe lock - albeit a possible
irq coming in the middle between these operations and messing up the
subclass. If this is unacceptable, we could always revert the patch,
as this was never a problem in the first place.

Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>

This is pretty much untested - I wanted to see what is the best way to
address the concerns first.

XXX: I don't think we need get/put_cpu() around the call, this was intended
only for ep_call_nested() tasks_call_list checking afaict.

fs/eventpoll.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 67a395039268..97d036deff3e 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -558,16 +558,17 @@ static void ep_poll_safewake(wait_queue_head_t *wq)
unsigned long flags;
int subclass;

- local_irq_save(flags);
- preempt_disable();
- subclass = __this_cpu_read(wakeup_nest);
- spin_lock_nested(&wq->lock, subclass + 1);
- __this_cpu_inc(wakeup_nest);
+ subclass = this_cpu_inc_return(wakeup_nest);
+ /*
+ * Careful: while unlikely, an irq can occur here and mess up
+ * the subclass. The same goes for right before the dec
+ * operation, but that does not matter.
+ */
+ spin_lock_irqsave_nested(&wq->lock, flags, subclass + 1);
wake_up_locked_poll(wq, POLLIN);
- __this_cpu_dec(wakeup_nest);
- spin_unlock(&wq->lock);
- local_irq_restore(flags);
- preempt_enable();
+ spin_unlock_irqrestore(&wq->lock, flags);
+ this_cpu_dec(wakeup_nest);