Re: [PATCH v6 5/6] locking/rwsem: Enable direct rwsem lock handoff
From: Waiman Long
Date: Mon Jan 23 2023 - 17:10:00 EST
On 1/23/23 12:30, Waiman Long wrote:
> I will update the patch description to highlight the points that I
> discussed in this email.
I am planning to update the patch description as follows:
The lock handoff provided in rwsem isn't a true handoff like that in
the mutex. Instead, it is more like a quiescent state where optimistic
spinning and lock stealing are disabled to make it easier for the first
waiter to acquire the lock.
For mutex, lock handoff is done at unlock time as the owner value and
the handoff bit are in the same lock word and can be updated atomically.
That is not the case for rwsem, which has a separate count value for
locking and an owner value. The only way to update them in a quasi-atomic
way is to use the wait_lock for synchronization, as the handoff bit can
only be updated while holding the wait_lock. So for rwsem, the new
lock handoff mechanism is done mostly at rwsem_wake() time, when the
wait_lock has to be acquired anyway, to minimize additional overhead.
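
To illustrate the difference in layout, here is a rough userspace C11
sketch, not kernel code. The toy_* names and the flag value are
hypothetical and chosen only for illustration. It shows why a lock that
keeps owner and handoff bit in one word can hand off with a single
compare-and-swap, while a lock with separate count and owner words cannot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value; assumes task "pointers" have bit 0 clear. */
#define TOY_FLAG_HANDOFF 0x1UL

/* Mutex-like lock: owner pointer and flag bits share one word. */
struct toy_mutex {
	atomic_uintptr_t owner;		/* task pointer | flag bits */
};

/*
 * rwsem-like lock: locking state and owner live in separate words, so
 * no single atomic operation can update both consistently.
 */
struct toy_rwsem_words {
	atomic_long count;		/* lock bits + handoff bit */
	atomic_uintptr_t owner;		/* updated separately from count */
};

/*
 * Unlock-time handoff for the toy mutex: one compare-and-swap replaces
 * "me | HANDOFF" with "next", so no other task can slip in between.
 * Returns true if the handoff was performed.
 */
static bool toy_mutex_unlock_handoff(struct toy_mutex *m,
				     uintptr_t me, uintptr_t next)
{
	uintptr_t expected = me | TOY_FLAG_HANDOFF;

	/* The new owner value must not carry the flag bit. */
	assert((next & TOY_FLAG_HANDOFF) == 0);
	return atomic_compare_exchange_strong(&m->owner, &expected, next);
}
```

With the two-word layout, the same transfer needs an external lock
(the wait_lock in rwsem) to make the pair of updates appear atomic.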
Passing the count value at unlock time down to rwsem_wake() to determine
if handoff should be done is not safe, as the waiter that set the
RWSEM_FLAG_HANDOFF bit may have been interrupted or killed in the
interim. So we need to recheck the count value after taking the
wait_lock. If there is an active lock, we can't perform the handoff
even if the handoff bit is set at both the unlock and rwsem_wake()
times. This is because there is a slight possibility that the original
waiter that set the handoff bit may have bailed out, followed by a read
lock, and then the handoff bit is set again by another waiter.
It is also likely that the active lock in this case may be a transient
RWSEM_READER_BIAS that will be removed soon. So we have a secondary
handoff done in the reader slow path to handle this particular case.
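
The recheck described above could look roughly like the following
userspace C11 sketch. This is a simplified model, not the patch itself:
the toy_* names, the spinlock standing in for wait_lock, and the bit
layout (only loosely modelled on the rwsem count encoding) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit layout, loosely modelled on the rwsem count word. */
#define TOY_WRITER_LOCKED	0x1L
#define TOY_FLAG_HANDOFF	0x4L
#define TOY_READER_BIAS		0x100L
#define TOY_READER_MASK		(~0xffL)
#define TOY_LOCK_MASK		(TOY_WRITER_LOCKED | TOY_READER_MASK)

struct toy_rwsem {
	atomic_long count;
	atomic_flag wait_lock;	/* toy spinlock standing in for wait_lock */
	int nr_waiters;		/* protected by wait_lock */
};

static void toy_rwsem_init(struct toy_rwsem *sem, long count, int nr_waiters)
{
	assert(nr_waiters >= 0);
	atomic_init(&sem->count, count);
	atomic_flag_clear(&sem->wait_lock);
	sem->nr_waiters = nr_waiters;
}

/*
 * Wakeup-path check: ignore whatever count value was seen at unlock
 * time and re-read it under wait_lock. Handoff is allowed only if there
 * is a waiter, the handoff bit is still set, and nobody (reader or
 * writer) currently holds the lock.
 */
static bool toy_can_handoff(struct toy_rwsem *sem)
{
	bool ok;
	long count;

	while (atomic_flag_test_and_set_explicit(&sem->wait_lock,
						 memory_order_acquire))
		;	/* spin until the toy wait_lock is acquired */

	count = atomic_load(&sem->count);
	ok = sem->nr_waiters > 0 &&
	     (count & TOY_FLAG_HANDOFF) &&
	     !(count & TOY_LOCK_MASK);

	atomic_flag_clear_explicit(&sem->wait_lock, memory_order_release);
	return ok;
}
```

A transient reader bias in count makes the check fail even though the
handoff bit is set, which is the case the secondary handoff in the
reader slow path is meant to cover.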
For reader-owned rwsem, the owner value other than the RWSEM_READER_OWNED
bit is mostly for debugging purposes only. So it is not safe to use
the owner value to confirm that a handoff to a reader has happened. On
the other hand, we can do that when handing off to a writer. However, it
is simpler to use the same mechanism to notify that a handoff has
happened for both readers and writers. So a new HANDOFF_GRANTED state is
added to enum rwsem_handoff_state to signify that. This new value will
be written to the handoff_state value of the first waiter.
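
As a minimal sketch of that notification, assuming the enum already has
HANDOFF_NONE and HANDOFF_REQUESTED values from the earlier patches in
the series, the waker and waiter sides could be modelled in userspace
C11 as below. The toy_* names and the memory orderings are illustrative
assumptions, not the actual patch code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* HANDOFF_GRANTED is the new state; the other two values are assumed. */
enum rwsem_handoff_state {
	HANDOFF_NONE = 0,
	HANDOFF_REQUESTED,
	HANDOFF_GRANTED,
};

/* Hypothetical stand-in for the per-waiter structure. */
struct toy_waiter {
	_Atomic int handoff_state;
};

/*
 * Waker side: called after the lock has been transferred to the first
 * waiter (in the real code this would happen under wait_lock). The same
 * call is used whether the waiter is a reader or a writer.
 */
static void toy_grant_handoff(struct toy_waiter *first)
{
	assert(first != NULL);
	atomic_store_explicit(&first->handoff_state, HANDOFF_GRANTED,
			      memory_order_release);
}

/*
 * Waiter side: instead of inspecting the owner field, which is not
 * reliable for reader-owned locks, check our own handoff_state.
 */
static bool toy_handoff_granted(struct toy_waiter *w)
{
	return atomic_load_explicit(&w->handoff_state,
				    memory_order_acquire) == HANDOFF_GRANTED;
}
```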
With true lock handoff, there is no need to do NULL owner spinning
anymore, as a wakeup will be performed if the handoff is successful. So
it is likely that the first waiter won't actually go to sleep even when
schedule() is called in this case.
Please let me know what you think.
Cheers,
Longman