Hi Waiman,
On Fri, Jun 19, 2015 at 04:50:02PM +0100, Waiman Long wrote:
> The current cmpxchg() loop in setting the _QW_WAITING flag for writers
> in queue_write_lock_slowpath() will contend with incoming readers
> causing possibly extra cmpxchg() operations that are wasteful. This
> patch changes the code to do a byte cmpxchg() to eliminate contention
> with new readers.

[...]

> diff --git a/arch/x86/include/asm/qrwlock.h b/arch/x86/include/asm/qrwlock.h
> index a8810bf..5678b0a 100644
> --- a/arch/x86/include/asm/qrwlock.h
> +++ b/arch/x86/include/asm/qrwlock.h
> @@ -7,8 +7,7 @@
>  #define queued_write_unlock queued_write_unlock
>  static inline void queued_write_unlock(struct qrwlock *lock)
>  {
> -	barrier();
> -	ACCESS_ONCE(*(u8 *)&lock->cnts) = 0;
> +	smp_store_release(&lock->wmode, 0);
>  }
>  #endif

I reckon you could actually use this in the asm-generic header and remove
the x86 arch version altogether. Most architectures support single-copy
atomic byte access and those that don't (alpha?) can just not use qrwlock
(or override write_unlock with atomic_sub).
I already have a patch making this change, so I'm happy either way.
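
For illustration only, here is a rough userspace model of the two unlock
variants mentioned above: a release store to the writer byte, and the
atomic_sub-style fallback on the whole lock word. This is not kernel code.
The names (model_qrwlock, model_write_unlock_byte, model_write_unlock_sub)
and the constants are hypothetical stand-ins, the GCC/Clang __atomic
builtins stand in for smp_store_release() and atomic_sub(), and the layout
assumes a little-endian machine with the writer state in the low byte.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ != __ORDER_LITTLE_ENDIAN__
#error "this sketch assumes a little-endian layout"
#endif

/* Assumed values for the model: writer state in bits 0-7, readers above. */
#define QW_LOCKED 0xffu            /* a writer holds the lock */
#define QR_SHIFT  8                /* reader count starts at bit 8 */
#define QR_BIAS   (1u << QR_SHIFT) /* one reader */

/* Hypothetical userspace stand-in for struct qrwlock. */
struct model_qrwlock {
	union {
		uint32_t cnts;     /* whole lock word */
		uint8_t  wmode;    /* low byte: writer state (little-endian) */
	};
};

/* Byte-store unlock: clear only the writer byte, with release ordering. */
static inline void model_write_unlock_byte(struct model_qrwlock *lock)
{
	__atomic_store_n(&lock->wmode, 0, __ATOMIC_RELEASE);
}

/* Fallback unlock: subtract the writer value from the whole lock word. */
static inline void model_write_unlock_sub(struct model_qrwlock *lock)
{
	__atomic_fetch_sub(&lock->cnts, QW_LOCKED, __ATOMIC_RELEASE);
}

/* Single-threaded check that both variants preserve the reader bits. */
static inline void model_selfcheck(void)
{
	struct model_qrwlock a = { .cnts = QW_LOCKED | (2u * QR_BIAS) };
	struct model_qrwlock b = a;

	model_write_unlock_byte(&a);
	model_write_unlock_sub(&b);
	assert(a.cnts == 2u * QR_BIAS);
	assert(b.cnts == a.cnts);
}
```

The byte variant only works where a byte store is single-copy atomic with
respect to accesses to the containing word. The subtract variant avoids that
requirement at the cost of a read-modify-write on the full word.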