Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

From: Peter Zijlstra
Date: Thu Nov 29 2018 - 08:45:44 EST


On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote:
>
> +Cc davidlohr and waiman

> Urgh; so the case where the cmpxchg() fails because it already has a
> wakeup in progress, which then 'violates' our expectation of when the
> wakeup happens.
>
> Yes, I think this is real, and worse, I think we need to go audit all
> wake_q_add() users and document this behaviour.
>
> In the ideal case we'd delay the actual wakeup to the last wake_up_q(),
> but I don't think we can easily fix that.

See commit 1d0dcb3ad9d3 ("futex: Implement lockless wakeups"); I think
that introduces the exact same bug.
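
To make the failing-cmpxchg case concrete, here is a rough userspace sketch
of the queueing step only. It is a model, not the kernel code: struct task,
model_wake_q_add() and the plain-int refcount are made up for illustration,
and only the "cmpxchg from NULL, bail if it fails" shape is meant to match
wake_q_add().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel marking "already on some wake queue". */
#define MODEL_WAKE_Q_TAIL ((void *)0x1)

/* Hypothetical stand-in for the task; not the kernel's task_struct. */
struct task {
	_Atomic(void *) wake_q_next;	/* NULL when not queued anywhere */
	int usage;			/* toy refcount, not atomic in this model */
};

/*
 * Model of the queueing step: claim the node with a cmpxchg from NULL.
 * Returns true if this caller queued the task and took a reference,
 * false if a wakeup was already pending on behalf of someone else.
 */
static bool model_wake_q_add(struct task *p)
{
	void *expected = NULL;

	if (!atomic_compare_exchange_strong(&p->wake_q_next, &expected,
					    MODEL_WAKE_Q_TAIL))
		return false;	/* already queued: nothing is added here */

	p->usage++;		/* the queue now holds a reference */
	return true;
}
```

When the second call returns false, the eventual wakeup belongs to the
first caller's queue, so it can be issued before the second caller has
finished whatever stores it does after its own wake_q_add(). That is the
ordering assumption being violated.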

Something like the below perhaps, although this pattern seems to want a
wake_q_add() variant that already assumes get_task_struct().

diff --git a/kernel/futex.c b/kernel/futex.c
index f423f9b6577e..d14971f6ed3d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1387,11 +1387,7 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
 	if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
 		return;

-	/*
-	 * Queue the task for later wakeup for after we've released
-	 * the hb->lock. wake_q_add() grabs reference to p.
-	 */
-	wake_q_add(wake_q, p);
+	get_task_struct(p);
 	__unqueue_futex(q);
 	/*
 	 * The waiting task can free the futex_q as soon as q->lock_ptr = NULL
@@ -1401,6 +1397,13 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
 	 * plist_del in __unqueue_futex().
 	 */
 	smp_store_release(&q->lock_ptr, NULL);
+
+	/*
+	 * Queue the task for later wakeup for after we've released
+	 * the hb->lock. wake_q_add() grabs reference to p.
+	 */
+	wake_q_add(wake_q, p);
+	put_task_struct(p);
 }

/*