Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

From: Yongji Xie
Date: Thu Nov 29 2018 - 09:02:20 EST


On Thu, 29 Nov 2018 at 21:45, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote:
> >
> > +Cc davidlohr and waiman
>
> > Urgh; so the case where the cmpxchg() fails because it already has a
> > wakeup in progress, which then 'violates' our expectation of when the
> > wakeup happens.
> >
> > Yes, I think this is real, and worse, I think we need to go audit all
> > wake_q_add() users and document this behaviour.
> >
> > In the ideal case we'd delay the actual wakeup to the last wake_up_q(),
> > but I don't think we can easily fix that.
>
> See commit: 1d0dcb3ad9d3 ("futex: Implement lockless wakeups"), I think
> that introduces the exact same bug.
>

Hmm... Yes, the thread may even already be on futex's wake_q, which would cause
rwsem's wakeup to be missed.
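
To make the failure mode concrete, here is a minimal userspace sketch of it. It
is only a model under my own assumptions: the *_model() helpers and the structs
are hypothetical stand-ins built on C11 atomics, not the kernel code, and the
"wakeup" is just a counter. It shows that once a task is on one wake_q, a second
wake_q_add() on another queue does nothing, so the only wakeup is the one issued
by whoever flushes the first queue, at a time the second caller does not control.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct wake_q_node {
	_Atomic(struct wake_q_node *) next;
};

struct task {
	struct wake_q_node wake_q;
	int wakeups;	/* counts modelled wakeups delivered to this task */
};

struct wake_q_head {
	_Atomic(struct wake_q_node *) first;
	_Atomic(struct wake_q_node *) *lastp;
};

#define WAKE_Q_TAIL ((struct wake_q_node *)0x01)

static void wake_q_init_model(struct wake_q_head *head)
{
	atomic_init(&head->first, WAKE_Q_TAIL);
	head->lastp = &head->first;
}

/* Returns false when the task is already queued on some wake_q. */
static bool wake_q_add_model(struct wake_q_head *head, struct task *t)
{
	struct wake_q_node *expected = NULL;

	/* A non-NULL next means a wakeup is already pending elsewhere. */
	if (!atomic_compare_exchange_strong(&t->wake_q.next, &expected, WAKE_Q_TAIL))
		return false;

	atomic_store(head->lastp, &t->wake_q);
	head->lastp = &t->wake_q.next;
	return true;
}

/* Walks the queue, unlinks each task and "wakes" it. */
static void wake_up_q_model(struct wake_q_head *head)
{
	struct wake_q_node *node = atomic_load(&head->first);

	while (node != WAKE_Q_TAIL) {
		struct task *t = (struct task *)((char *)node - offsetof(struct task, wake_q));

		node = atomic_load(&t->wake_q.next);
		atomic_store(&t->wake_q.next, NULL);
		t->wakeups++;
	}
}

/* Returns how many wakeups the second ("rwsem") queue delivered. */
int rwsem_side_wakeups(void)
{
	struct task t = { .wakeups = 0 };
	struct wake_q_head futex_q, rwsem_q;
	bool first, second;
	int seen;

	atomic_init(&t.wake_q.next, NULL);
	wake_q_init_model(&futex_q);
	wake_q_init_model(&rwsem_q);

	first = wake_q_add_model(&futex_q, &t);		/* queues t */
	second = wake_q_add_model(&rwsem_q, &t);	/* silent no-op */
	assert(first && !second);

	wake_up_q_model(&rwsem_q);	/* empty queue: wakes nobody */
	seen = t.wakeups;

	wake_up_q_model(&futex_q);	/* the only wakeup t ever gets */
	assert(t.wakeups == 1);
	return seen;
}
```

In this model rwsem_side_wakeups() returns 0. If the first queue had been
flushed before the second caller published its state (for rwsem, before the
reader's waiter->task is cleared), the woken task could observe the old state
and go back to sleep with no further wakeup pending.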

It seems that fixing this problem case by case and documenting the behaviour is
easier than delaying the actual wakeup to the last wake_up_q()...

Thanks,
Yongji

> Something like the below perhaps, although this pattern seems to want a
> wake_q_add() variant that already assumes get_task_struct().
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index f423f9b6577e..d14971f6ed3d 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1387,11 +1387,7 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
>  	if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
>  		return;
>
> -	/*
> -	 * Queue the task for later wakeup for after we've released
> -	 * the hb->lock. wake_q_add() grabs reference to p.
> -	 */
> -	wake_q_add(wake_q, p);
> +	get_task_struct(p);
>  	__unqueue_futex(q);
>  	/*
>  	 * The waiting task can free the futex_q as soon as q->lock_ptr = NULL
> @@ -1401,6 +1397,13 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
>  	 * plist_del in __unqueue_futex().
>  	 */
>  	smp_store_release(&q->lock_ptr, NULL);
> +
> +	/*
> +	 * Queue the task for later wakeup for after we've released
> +	 * the hb->lock. wake_q_add() grabs reference to p.
> +	 */
> +	wake_q_add(wake_q, p);
> +	put_task_struct(p);
>  }
>
> /*