> I cannot resist suggesting that any lock that interacts with
> spin_unlock_wait() must have all relevant acquisitions followed by
> smp_mb__after_unlock_lock().


1. This would expand the purpose of smp_mb__after_unlock_lock(),
right? smp_mb__after_unlock_lock() is for making an UNLOCK-LOCK
pair globally transitive, rather than for guaranteeing that no
operations can be reordered before the STORE part of a LOCK/ACQUIRE.

2. If ARM64 has the same problem as PPC now,
smp_mb__after_unlock_lock() can't help, as it's a no-op on


