Re: [PATCH] locking/rwsem: Synchronize task state & waiter->task of readers

From: Benjamin Herrenschmidt
Date: Wed Apr 18 2018 - 02:22:08 EST


On Tue, 2018-04-10 at 13:22 -0400, Waiman Long wrote:
> It was observed occasionally on PowerPC systems that a reader had
> not been woken up even though its waiter->task had been cleared.
>
> One probable cause of this missed wakeup may be the fact that the
> waiter->task and the task state have not been properly synchronized as
> the lock release-acquire pair of different locks in the wakeup code path
> does not provide a full memory barrier guarantee. So smp_store_mb()
> is now used to set waiter->task to NULL to provide a proper memory
> barrier for synchronization.
>
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>

That looks right... nothing in either lock or unlock will prevent an
earlier store from being reordered past a later load.
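
To make the store-past-load case concrete, here is a minimal sketch in
plain userspace C (not kernel code) that enumerates the global orderings
of the four operations in the wakeup/sleep diagram from the patch
comment. The function name count_missed_wakeups() and the enum are
hypothetical, introduced only for this illustration, and "no barrier" is
modelled simply as "a store may be ordered after the load that follows
it", which is a simplification of what real hardware does.

```c
#include <stdbool.h>

/* The four memory operations from the wakeup/sleep diagram in the patch. */
enum op {
	WAKER_STORE,	/* [S] waiter->task = NULL */
	WAKER_LOAD,	/* [L] task state, in try_to_wake_up() */
	SLEEPER_STORE,	/* [S] set_current_state(state) */
	SLEEPER_LOAD,	/* [L] waiter->task */
	NR_OPS
};

/*
 * Hypothetical helper: count the global orderings of the four operations
 * in which both loads observe the old values, i.e. a missed wakeup.
 * With full_barrier set, each side's store must precede its own load.
 * Without it, the store may be ordered after the load (simplified model).
 */
int count_missed_wakeups(bool full_barrier)
{
	int missed = 0;

	/* Each 2-bit field of 'code' is the time slot (0..3) of one op. */
	for (int code = 0; code < 256; code++) {
		int slot[NR_OPS];
		unsigned int used = 0;

		for (int i = 0; i < NR_OPS; i++) {
			slot[i] = (code >> (2 * i)) & 3;
			used |= 1u << slot[i];
		}
		if (used != 0xf)
			continue;	/* not a permutation: a slot is reused */

		if (full_barrier &&
		    (slot[WAKER_STORE] > slot[WAKER_LOAD] ||
		     slot[SLEEPER_STORE] > slot[SLEEPER_LOAD]))
			continue;	/* ordering forbidden by the barrier */

		/* Waker sees old task state, sleeper sees old waiter->task. */
		if (slot[WAKER_LOAD] < slot[SLEEPER_STORE] &&
		    slot[SLEEPER_LOAD] < slot[WAKER_STORE])
			missed++;
	}
	return missed;
}
```

In this model count_missed_wakeups(true) returns 0, because the four
ordering constraints would form a cycle, while count_missed_wakeups(false)
returns 6 of the 24 possible orderings. That is the store-buffering
pattern: once a store on each side can pass the following load, both
sides can read the stale value and the reader sleeps with nobody left to
wake it.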

> ---
> kernel/locking/rwsem-xadd.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> index e795908..b3c588c 100644
> --- a/kernel/locking/rwsem-xadd.c
> +++ b/kernel/locking/rwsem-xadd.c
> @@ -209,6 +209,23 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
> smp_store_release(&waiter->task, NULL);
> }
>
> + /*
> + * To avoid missed wakeup of reader, we need to make sure
> + * that task state and waiter->task are properly synchronized.
> + *
> + * wakeup                    sleep
> + * ------                    -----
> + * __rwsem_mark_wake:        rwsem_down_read_failed*:
> + *   [S] waiter->task          [S] set_current_state(state)
> + *       MB                        MB
> + * try_to_wake_up:
> + *   [L] state                 [L] waiter->task
> + *
> + * For the wakeup path, the original lock release-acquire pair
> + * does not provide enough guarantee of proper synchronization.
> + */
> + smp_mb();
> +
> adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment;
> if (list_empty(&sem->wait_list)) {
> /* hit end of list above */