On Wed, Aug 02, 2023 at 04:16:12PM -0400, Waiman Long wrote:
> On 7/12/23 17:11, Kent Overstreet wrote:
> > These are used by bcachefs's six locks.
> >
> > Signed-off-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Waiman Long <longman@xxxxxxxxxx>
> > Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> > ---
> >  kernel/locking/osq_lock.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index d5610ad52b..b752ec5cc6 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -203,6 +203,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >
> >  	return false;
> >  }
> > +EXPORT_SYMBOL_GPL(osq_lock);
> >
> >  void osq_unlock(struct optimistic_spin_queue *lock)
> >  {
> > @@ -230,3 +231,4 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> >  	if (next)
> >  		WRITE_ONCE(next->locked, 1);
> >  }
> > +EXPORT_SYMBOL_GPL(osq_unlock);
>
> Have you considered extending the current rw_semaphore to support SIX lock
> semantics? There are a number of instances in the kernel where an up_read() is
> followed by a down_write(). Basically, the code tries to upgrade the lock from
> read to write. I have been thinking about adding an upgrade_read() API to do
> that. However, the concern that I had was that another writer may come in
> and make modifications before the reader can be upgraded to have exclusive
> write access, which would make the task repeat what has been done in the read
> lock part. By adding a read with intent to upgrade to write, we can have
> that guarantee.

It's been discussed, Linus had the same thought.

But it'd be a massive change to the rw semaphore code; this "read with
intent" really is a third lock state which needs all the same
lock/trylock/unlock paths, and with the way rw semaphore has separate
entry points for read and write it'd be a _ton_ of new code. It really
touches everything - waitlist handling included.
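
To make the "third lock state" point concrete, here is a rough userspace
sketch of the shared/intent/exclusive compatibility rules. It is a toy under
stated assumptions, not the six lock code and not a proposal for rwsem: every
name in it (toy_six_lock, toy_tryread, toy_tryintent, toy_trywrite, toy_demo)
is made up for illustration, it only has trylock paths, and it has no
waitlist, no spinning and no fairness. It assumes the usual six lock rule that
write is only taken by a holder of intent.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical toy lock, for illustration only.
 * State word: low 30 bits = reader count, bit 30 = intent held,
 * bit 31 = write held. Reader count overflow is ignored here.
 */
#define TOY_INTENT (1u << 30)
#define TOY_WRITE  (1u << 31)

struct toy_six_lock {
	atomic_uint state;
};

/* Read is compatible with other readers and with intent, not with write. */
static bool toy_tryread(struct toy_six_lock *l)
{
	unsigned old = atomic_load(&l->state);

	do {
		if (old & TOY_WRITE)
			return false;
	} while (!atomic_compare_exchange_weak(&l->state, &old, old + 1));
	return true;
}

static void toy_unread(struct toy_six_lock *l)
{
	atomic_fetch_sub(&l->state, 1);
}

/* Intent excludes other intent holders (and writers) but admits readers. */
static bool toy_tryintent(struct toy_six_lock *l)
{
	unsigned old = atomic_load(&l->state);

	do {
		if (old & (TOY_INTENT | TOY_WRITE))
			return false;
	} while (!atomic_compare_exchange_weak(&l->state, &old,
					       old | TOY_INTENT));
	return true;
}

static void toy_unintent(struct toy_six_lock *l)
{
	atomic_fetch_and(&l->state, ~TOY_INTENT);
}

/*
 * Upgrade intent -> write. Caller must already hold intent; succeeds only
 * once the reader count has drained to zero.
 */
static bool toy_trywrite(struct toy_six_lock *l)
{
	unsigned expected = TOY_INTENT;

	return atomic_compare_exchange_strong(&l->state, &expected,
					      TOY_INTENT | TOY_WRITE);
}

/* Dropping write leaves the caller holding intent again. */
static void toy_unwrite(struct toy_six_lock *l)
{
	atomic_fetch_and(&l->state, ~TOY_WRITE);
}

/* Walks through the rules above; returns 0 if each step behaves as described. */
static int toy_demo(void)
{
	struct toy_six_lock l;

	atomic_init(&l.state, 0);

	if (!toy_tryread(&l))   return 1; /* a reader gets in */
	if (!toy_tryintent(&l)) return 2; /* intent coexists with readers */
	if (toy_tryintent(&l))  return 3; /* second intent holder is refused */
	if (!toy_tryread(&l))   return 4; /* readers still admitted under intent */
	if (toy_trywrite(&l))   return 5; /* upgrade blocked while readers remain */
	toy_unread(&l);
	toy_unread(&l);
	if (!toy_trywrite(&l))  return 6; /* readers drained: upgrade succeeds */
	if (toy_tryread(&l))    return 7; /* write excludes readers */
	toy_unwrite(&l);
	toy_unintent(&l);
	return atomic_load(&l.state) == 0 ? 0 : 8;
}
```

Because only one task can hold intent at a time, no other writer can slip in
between the read phase and the upgrade, which is the guarantee the
up_read()/down_write() pattern lacks. Even in this toy, intent needs its own
trylock and unlock paths next to read and write; a real implementation would
need the blocking and waitlist variants as well.
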
And six locks have several other features that bcachefs needs, and other
users may also end up wanting, that rw semaphores don't have; the two
main features being a percpu read lock mode and support for an external
cycle detector (which requires exposing lock waitlists, with some
guarantees about how those waitlists are used).
> With that said, I would prefer to keep osq_{lock/unlock} for internal use by
> some higher level locking primitives - mutex, rwsem and rt_mutex.

Yeah, I'm aware, but it seems like exposing osq_(lock|unlock) is the
most palatable solution for now. Long term, I'd like to get six locks
promoted to kernel/locking.