Re: [PATCH 11/20] locking/osq: Export osq_(lock|unlock)

From: Waiman Long
Date: Wed Aug 02 2023 - 17:10:09 EST


On 8/2/23 16:44, Kent Overstreet wrote:
On Wed, Aug 02, 2023 at 04:16:12PM -0400, Waiman Long wrote:
On 7/12/23 17:11, Kent Overstreet wrote:
These are used by bcachefs's six locks.

Signed-off-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
---
kernel/locking/osq_lock.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index d5610ad52b..b752ec5cc6 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -203,6 +203,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
return false;
}
+EXPORT_SYMBOL_GPL(osq_lock);
void osq_unlock(struct optimistic_spin_queue *lock)
{
@@ -230,3 +231,4 @@ void osq_unlock(struct optimistic_spin_queue *lock)
if (next)
WRITE_ONCE(next->locked, 1);
}
+EXPORT_SYMBOL_GPL(osq_unlock);
Have you considered extending the current rw_semaphore to support a SIX lock
semantics? There are a number of instances in the kernel that a up_read() is
followed by a down_write(). Basically, the code try to upgrade the lock from
read to write. I have been thinking about adding a upgrade_read() API to do
that. However, the concern that I had was that another writer may come in
and make modification before the reader can be upgraded to have exclusive
write access and will make the task to repeat what has been done in the read
lock part. By adding a read with intent to upgrade to write, we can have
that guarantee.
It's been discussed, Linus had the same thought.

But it'd be a massive change to the rw semaphore code; this "read with
intent" really is a third lock state which needs all the same
lock/trylock/unlock paths, and with the way rw semaphore has separate
entry points for read and write it'd be a _ton_ of new code. It really
touches everything - waitlist handling included.

Yes, it is a major change, but I had done that before and it is certainly doable. There are spare bits in the low byte of rwsem->count that can be used as an intent bit. We also need to add a new rwsem_wake_type for that for waitlist handling.



And six locks have several other features that bcachefs needs, and other
users may also end up wanting, that rw semaphores don't have; the two
main features being a percpu read lock mode and support for an external
cycle detector (which requires exposing lock waitlists, with some
guarantees about how those waitlists are used).

Can you provide more information about those features?


With that said, I would prefer to keep osq_{lock/unlock} for internal use by
some higher level locking primitives - mutex, rwsem and rt_mutex.
Yeah, I'm aware, but it seems like exposing osq_(lock|unlock) is the
most palatable solution for now. Long term, I'd like to get six locks
promoted to kernel/locking.

Your SIX overlaps with rwsem in term of features. So we will have to somehow merge them instead of having 2 APIs with somewhat similar functionality.

Cheers,
Longman