Re: [PATCH v3 2/5] locking/rwsem: Limit # of null owner retries for handoff writer
From: Peter Zijlstra
Date: Tue Oct 25 2022 - 07:22:43 EST

On Mon, Oct 24, 2022 at 11:55:53AM -0400, Waiman Long wrote:
> Looks like there is still a window for a race.
>
> There is a chance that a reader who came first adds its BIAS and goes to the
> slowpath, and before it gets added to the wait list it is preempted by an RT
> task which goes to the slowpath as well and, being the first waiter, gets its
> hand-off bit set but is not able to get the lock due to the following
> condition in rwsem_try_write_lock():
>
>         if (count & RWSEM_LOCK_MASK) {    ==> reader has set its bias
>                 ..
>                 ...
>
>                 new |= RWSEM_FLAG_HANDOFF;
>         } else {
>                 new |= RWSEM_WRITER_LOCKED;
>
>
> ---------------------->----------------------->-------------------------
>
> First reader (1)         writer (2) RT task         Lock holder (3)
>
> It sets
> RWSEM_READER_BIAS.
> while it is going to
> slowpath (as the lock
> was held by (3)) and
> before it got added
> to the waiters list
> it got preempted
> by (2).
>                          RT task also takes
>                          the slowpath and add       release the
>                          itself into waiting list   rwsem lock
>                          and since it is the first  clear the
>                          it is the next one to get  owner.
>                          the lock but it can not
>                          get the lock as (count &
>                          RWSEM_LOCK_MASK) is set
>                          as (1) has added it but
>                          not able to remove its
>                          adjustment.
>
> ----------------------
>
> To fix that we either have to disable preemption in down_read() and re-enable
> it in rwsem_down_read_slowpath() after decrementing the RWSEM_READER_BIAS, or
> limit the number of trylock-spinning attempts like this patch. The latter
> approach seems a bit less messy and I am going to take it back out anyway in
> patch 4. I will put a summary of that special case in the patch description.

Funny, I find the former approach much saner. Disabling preemption
around the whole thing fixes the fundamental problem while spin-limiting
is a band-aid.
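
For illustration, the stuck state from the quoted timeline can be replayed
in a small single-threaded userspace model. This is only a sketch: the bit
values are assumed to follow the layout in kernel/locking/rwsem.c, and
model_try_write_lock() and replay_race() are hypothetical helpers, not
kernel functions. It shows that no number of spin attempts by (2) can
succeed until (1) gets to run and back out its bias.

```c
#include <stdbool.h>

/* Assumed bit layout, modelled on kernel/locking/rwsem.c. */
#define RWSEM_WRITER_LOCKED  (1UL << 0)
#define RWSEM_FLAG_WAITERS   (1UL << 1)
#define RWSEM_FLAG_HANDOFF   (1UL << 2)
#define RWSEM_READER_SHIFT   8
#define RWSEM_READER_BIAS    (1UL << RWSEM_READER_SHIFT)
#define RWSEM_READER_MASK    (~(RWSEM_READER_BIAS - 1))
#define RWSEM_LOCK_MASK      (RWSEM_WRITER_LOCKED | RWSEM_READER_MASK)

/* Hypothetical stand-in for the first waiter's write trylock. */
static bool model_try_write_lock(unsigned long *count)
{
	unsigned long new = *count;

	if (*count & RWSEM_LOCK_MASK) {
		/* Lock still looks busy: all we can do is request a handoff. */
		new |= RWSEM_FLAG_HANDOFF;
		*count = new;
		return false;
	}

	/* Lock is free: take it and drop the handoff request. */
	new |= RWSEM_WRITER_LOCKED;
	new &= ~RWSEM_FLAG_HANDOFF;
	*count = new;
	return true;
}

/*
 * Replays the timeline. Returns how many of `attempts` trylocks by (2)
 * failed while (1) was preempted; *final_ok is the result of one more
 * trylock after (1) has removed its bias.
 */
static int replay_race(int attempts, bool *final_ok)
{
	unsigned long count = RWSEM_WRITER_LOCKED; /* (3) holds the lock */
	int failed = 0;

	count += RWSEM_READER_BIAS;    /* (1) adds its bias, then is preempted */
	count |= RWSEM_FLAG_WAITERS;   /* (2) queues itself as first waiter */
	count &= ~RWSEM_WRITER_LOCKED; /* (3) releases the lock */

	for (int i = 0; i < attempts; i++)
		if (!model_try_write_lock(&count))
			failed++;

	count -= RWSEM_READER_BIAS;    /* (1) finally runs and backs out */
	*final_ok = model_try_write_lock(&count);
	return failed;
}
```

Calling replay_race(n, &ok) from a test harness reports n failed attempts
for any n, which is why capping the retries only bounds the symptom.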
Note how rwsem_write_trylock() already does preempt_disable(); having
the read-side do something similar only makes sense.
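
A rough userspace model of what the read side could look like, under the
assumption that the bias is backed out in the slow path before preemption
is re-enabled, as described in the quoted text. Everything here is a
hypothetical sketch: preempt_disable()/preempt_enable() are stubbed as a
depth counter, MODEL_READ_FAILED_MASK is an assumed mask, and
model_down_read() is not the kernel's down_read().

```c
#include <stdbool.h>

/* Assumed bit layout, modelled on kernel/locking/rwsem.c. */
#define RWSEM_WRITER_LOCKED  (1UL << 0)
#define RWSEM_FLAG_WAITERS   (1UL << 1)
#define RWSEM_FLAG_HANDOFF   (1UL << 2)
#define RWSEM_READER_BIAS    (1UL << 8)

/* Assumed: any of these bits sends a reader to the slow path. */
#define MODEL_READ_FAILED_MASK \
	(RWSEM_WRITER_LOCKED | RWSEM_FLAG_WAITERS | RWSEM_FLAG_HANDOFF)

/* Userspace stubs: a depth counter instead of the real preempt count. */
static int preempt_depth;
static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { preempt_depth--; }

/* Hypothetical slow path: remove the bias first, then allow preemption. */
static void model_down_read_slowpath(unsigned long *count)
{
	*count -= RWSEM_READER_BIAS;
	preempt_enable();
	/* The real code would now join the wait list and sleep. */
}

/* Returns true if the read lock was taken on the fast path. */
static bool model_down_read(unsigned long *count)
{
	preempt_disable();
	*count += RWSEM_READER_BIAS;

	if (!(*count & MODEL_READ_FAILED_MASK)) {
		preempt_enable();
		return true;
	}

	/* The bias is never visible while this task can be preempted. */
	model_down_read_slowpath(count);
	return false;
}
```

With this shape there is no point at which a preempted reader leaves its
bias in the count, so the writer's (count & RWSEM_LOCK_MASK) test cannot
be held hostage by a task that is not running.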