Re: [RFC PATCH 3/5] asm-generic/bitops/atomic.h: Rewrite using atomic_fetch_*
From: Peter Zijlstra
Date: Fri Feb 16 2018 - 05:21:10 EST
On Thu, Feb 15, 2018 at 06:20:49PM +0000, Will Deacon wrote:
> On Thu, Feb 15, 2018 at 06:08:47PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 15, 2018 at 03:29:33PM +0000, Will Deacon wrote:
> > > +static inline void __clear_bit_unlock(unsigned int nr,
> > > + volatile unsigned long *p)
> > > +{
> > > + unsigned long old;
> > >
> > > + p += BIT_WORD(nr);
> > > + old = READ_ONCE(*p);
> > > + old &= ~BIT_MASK(nr);
> > > + smp_store_release(p, old);
> >
> > This should be atomic_set_release() I think, for the special case where
> > atomics are implemented with spinlocks, see the 'fun' case in
> > Documentation/atomic_t.txt.
>
> My understanding of __clear_bit_unlock is that there is guaranteed to be
> no concurrent accesses to the same word, so why would it matter whether
> locks are used to implement atomics?
commit f75d48644c56a31731d17fa693c8175328957e1d
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Wed Mar 9 12:40:54 2016 +0100
bitops: Do not default to __clear_bit() for __clear_bit_unlock()
__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.
Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.
If an arch is able to do better; is must provide an implementation of
__clear_bit_unlock() itself.
Specifically, this came up as a result of hackbench livelock'ing in
slab_lock() on ARC with SMP + SLUB + !LLSC.
The issue was incorrect pairing of atomic ops.
slab_lock() -> bit_spin_lock() -> test_and_set_bit()
slab_unlock() -> __bit_spin_unlock() -> __clear_bit()
The non serializing __clear_bit() was getting "lost"
80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set
80543b90: or r3,r2,1 <--- (B) other core unlocks right here
80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
Fixes ARC STAR 9000817404 (and probably more).
Reported-by: Vineet Gupta <Vineet.Gupta1@xxxxxxxxxxxx>
Tested-by: Vineet Gupta <Vineet.Gupta1@xxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Helge Deller <deller@xxxxxx>
Cc: James E.J. Bottomley <jejb@xxxxxxxxxxxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Noam Camus <noamc@xxxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/20160309114054.GJ6356@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index c30266e94806..8ef0ccbf8167 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -29,16 +29,16 @@ do { \
* @nr: the bit to set
* @addr: the address to start counting from
*
- * This operation is like clear_bit_unlock, however it is not atomic.
- * It does provide release barrier semantics so it can be used to unlock
- * a bit lock, however it would only be used if no other CPU can modify
- * any bits in the memory until the lock is released (a good example is
- * if the bit lock itself protects access to the other bits in the word).
+ * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
+ * the bits in the word are protected by this lock some archs can use weaker
+ * ops to safely unlock.
+ *
+ * See for example x86's implementation.
*/
#define __clear_bit_unlock(nr, addr) \
do { \
- smp_mb(); \
- __clear_bit(nr, addr); \
+ smp_mb__before_atomic(); \
+ clear_bit(nr, addr); \
} while (0)
#endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */