Re: [PATCH v3 2/5] locking/rwsem: Limit # of null owner retries for handoff writer

From: Peter Zijlstra
Date: Tue Oct 25 2022 - 07:22:43 EST


On Mon, Oct 24, 2022 at 11:55:53AM -0400, Waiman Long wrote:

> It looks like there is still a window for a race.
>
> There is a chance that a reader which came first adds its
> RWSEM_READER_BIAS and heads for the slowpath, but before it gets
> added to the wait list it is preempted by an RT task. The RT task
> takes the slowpath as well and, being the first waiter, gets its
> hand-off bit set, yet it still cannot take the lock because of the
> following condition in rwsem_try_write_lock():
>
>         if (count & RWSEM_LOCK_MASK) {  /* reader has set its bias */
>                 ...
>                 new |= RWSEM_FLAG_HANDOFF;
>         } else {
>                 new |= RWSEM_WRITER_LOCKED;
>         }
>
>
> ----------------------------------------------------------------------
>
> First reader (1)           Writer (2), RT task        Lock holder (3)
>
> Sets RWSEM_READER_BIAS.
> While heading for the
> slowpath (the lock is
> held by (3)), and before
> it gets added to the
> wait list, it is
> preempted by (2).
>
>                            The RT task also takes     Releases the
>                            the slowpath and adds      rwsem lock and
>                            itself to the wait list.   clears the
>                            Being the first waiter,    owner.
>                            it is next in line for
>                            the lock, but it cannot
>                            take it: (count &
>                            RWSEM_LOCK_MASK) is set
>                            because (1) has added
>                            its bias but has not
>                            been able to remove it.
>
> ----------------------------------------------------------------------
>
> To fix that, we either have to disable preemption in down_read() and
> re-enable it in rwsem_down_read_slowpath() after decrementing the
> RWSEM_READER_BIAS, or limit the number of trylock-spinning attempts
> as this patch does. The latter approach seems a bit less messy, and I
> am going to take it back out anyway in patch 4. I will put a summary
> of this special case in the patch description.

Funny, I find the former approach much saner. Disabling preemption
around the whole thing fixes the fundamental problem while spin-limiting
is a band-aid.
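
To make the window concrete (paraphrasing kernel/locking/rwsem.c from
memory, so the details may be off): the bias goes in on the fast path
and only comes back out well into the slowpath:

        /* fast path, rwsem_read_trylock(): the reader bias goes in */
        count = atomic_long_add_return_acquire(RWSEM_READER_BIAS, &sem->count);

        /*
         * The trylock fails because (3) still holds the lock. Preemption
         * anywhere in here is the window: the bias is in the count, but
         * the reader is not yet on the wait list.
         */

        /* slow path, rwsem_down_read_slowpath(): the bias comes out */
        rwsem_add_waiter(sem, &waiter);
        count = atomic_long_add_return(-RWSEM_READER_BIAS, &sem->count);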

Note how rwsem_write_trylock() already does preempt_disable(); having
the read-side do something similar only makes sense.
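
Something like the completely untested sketch below, against the
current __down_read_common() shape, with the slowpath expected to do
the matching preempt_enable() once the bias adjustment is resolved:

static inline int __down_read_common(struct rw_semaphore *sem, int state)
{
        long count;

        /*
         * Stay non-preemptible from the moment RWSEM_READER_BIAS goes
         * in until the slowpath has either taken the lock or pulled
         * the bias back out; that closes the window above.
         */
        preempt_disable();
        if (rwsem_read_trylock(sem, &count)) {
                preempt_enable();
                return 0;
        }

        /*
         * rwsem_down_read_slowpath() does the preempt_enable() itself,
         * right after the -RWSEM_READER_BIAS adjustment (or after it
         * has taken the lock).
         */
        if (IS_ERR(rwsem_down_read_slowpath(sem, count, state)))
                return -EINTR;

        return 0;
}

That keeps the reader non-preemptible for exactly the span in which
the count misrepresents the lock state, which is the invariant the
handoff logic depends on.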