Re: [PATCH] Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic()

From: Paul E. McKenney
Date: Tue Apr 23 2019 - 09:30:22 EST


On Tue, Apr 23, 2019 at 02:32:09PM +0200, Peter Zijlstra wrote:
> On Sat, Apr 20, 2019 at 01:54:40AM -0700, Paul E. McKenney wrote:
> > And atomic_set(): set_preempt_state(). This fails
> > on x86, s390, and TSO friends, does it not? Or is
> > this ARM-only? Still, why not just smp_mb() before and
> > after? Same issue in __kernfs_new_node(), bio_cnt_set(), and
> > sbitmap_queue_update_wake_batch().
> >
> > Ditto for atomic64_set() in __ceph_dir_set_complete().
> >
> > Ditto for atomic_read() in rvt_qp_is_avail(). This function
> > has a couple of other oddly placed smp_mb__before_atomic().
>
> Those are just straight-up bugs. The atomic_t.txt file clearly specifies
> the barriers only apply to RmW ops and both _set() and _read() are
> specified to not be a RmW.

Agreed. The "Ditto" covers my atomic_set() consternation. ;-)
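
For illustration only, here is a userspace sketch of the distinction,
written against C11 <stdatomic.h> rather than the kernel API. The helper
names are hypothetical, and a seq_cst fence is only an approximate
stand-in for smp_mb(). The point is that a plain store or load is not a
RmW operation, so ordering around it has to come from a full barrier
rather than from smp_mb__{before,after}_atomic():

```c
#include <stdatomic.h>

/*
 * Hypothetical userspace stand-in for atomic_set() followed by smp_mb().
 * atomic_set() is a plain store, not a RmW op, so the barrier that
 * follows it must be a full one.
 */
static inline void set_then_full_barrier_sketch(atomic_int *v, int val)
{
	/* Models atomic_set(): an unordered store. */
	atomic_store_explicit(v, val, memory_order_relaxed);
	/* Models smp_mb(), NOT smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Hypothetical userspace stand-in for smp_mb() followed by atomic_read().
 * atomic_read() is a plain load, again not a RmW op.
 */
static inline int full_barrier_then_read_sketch(atomic_int *v)
{
	/* Models smp_mb(), NOT smp_mb__before_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Models atomic_read(): an unordered load. */
	return atomic_load_explicit(v, memory_order_relaxed);
}
```

In kernel code the corresponding fix would presumably be to replace the
misplaced smp_mb__{before,after}_atomic() with smp_mb(), assuming the
ordering is needed at all.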

> > And atomic_cmpxchg(): msc_buffer_alloc(). This instance
> > of smp_mb__before_atomic() can be removed unless I am missing
> > something subtle. Ditto for kvm_vcpu_exiting_guest_mode(),
> > pv_kick_node(), and __sbq_wake_up().
>
> Note that pv_kick_node() uses cmpxchg_relaxed(), which does not
> otherwise imply barriers.

Good point, my eyes must have been going funny.

> > And lock acquisition??? acm_read_bulk_callback().
>
> I think it goes with the set_bit() earlier, but what do I know.

Quite possibly! In that case it should be smp_mb__after_atomic(),
and it would be nice if it immediately followed the set_bit().

> > In nfnl_acct_fill_info(), a smp_mb__before_atomic() after
> > an atomic64_xchg()??? Also before a clear_bit(), but the
> > clear_bit() is inside an "if".
>
> Since it is _before, I'm thinking the pairing was intended with the
> clear_bit(), and yes, then I would expect the smp_mb__before_atomic() to
> be part of that same branch.

It is quite possible that this one is a leftover, where the atomic
operation was removed but the smp_mb__{before,after}_atomic() lived on.
I had one of those in RCU, which now has a patch in -rcu.

> > There are a few cases that would see added overhead. For example,
> > svc_get_next_xprt() has the following:
> >
> > 	smp_mb__before_atomic();
> > 	clear_bit(SP_CONGESTED, &pool->sp_flags);
> > 	clear_bit(RQ_BUSY, &rqstp->rq_flags);
> > 	smp_mb__after_atomic();
> >
> > And xs_sock_reset_connection_flags() has this:
> >
> > 	smp_mb__before_atomic();
> > 	clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> > 	clear_bit(XPRT_CLOSING, &xprt->state);
> > 	xs_sock_reset_state_flags(xprt); /* Also a clear_bit(). */
> > 	smp_mb__after_atomic();
> >
> > Yeah, there are more than a few misuses, aren't there? :-/
> > A Coccinelle script seems in order, run in the 0day test robot.
>
> If we can get it to flag the right patterns, then yes, that might be
> useful regardless of the issue at hand; people seem to get this one
> wrong a lot.

To be fair, the odd-looking ones are maybe 5% of the total. Still too
many wrong, but the vast majority look OK.

> > But there are a number of helper functions whose purpose
> > seems to be to wrap an atomic in smp_mb__before_atomic() and
> > smp_mb__after_atomic(), so some of the atomic_xxx_mb() functions
> > might be a good idea just for improved readability.
>
> Are there really sites where _mb() makes sense? The above is just a lot
> of buggy code.

There are a great many that look like this:

	smp_mb__before_atomic();
	clear_bit(NFSD4_CLIENT_UPCALL_LOCK, &clp->cl_flags);
	smp_mb__after_atomic();

Replacing these three lines with this would not be a bad thing:

	clear_bit_mb(NFSD4_CLIENT_UPCALL_LOCK, &clp->cl_flags);
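
No such helper exists at this point, so the following is only a rough
userspace sketch of what it might look like, using C11 <stdatomic.h> as
a stand-in for the kernel primitives. Every name ending in _sketch is
hypothetical, and the seq_cst fences merely approximate
smp_mb__before_atomic() and smp_mb__after_atomic(), which can be cheaper
than a full barrier on architectures whose RmW operations are already
fully ordered:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG_SKETCH (8 * sizeof(unsigned long))

/*
 * Models the kernel's clear_bit(): an atomic RmW operation that by
 * itself provides no ordering.
 */
static inline void clear_bit_sketch(unsigned int nr,
				    _Atomic unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG_SKETCH);

	atomic_fetch_and_explicit(&addr[nr / BITS_PER_LONG_SKETCH], ~mask,
				  memory_order_relaxed);
}

/*
 * Hypothetical clear_bit_mb(): the same RmW operation wrapped in a
 * barrier on each side, replacing the three-line pattern with one call.
 */
static inline void clear_bit_mb_sketch(unsigned int nr,
				       _Atomic unsigned long *addr)
{
	atomic_thread_fence(memory_order_seq_cst); /* ~ smp_mb__before_atomic() */
	clear_bit_sketch(nr, addr);
	atomic_thread_fence(memory_order_seq_cst); /* ~ smp_mb__after_atomic() */
}
```

A real kernel version would presumably just be an inline function
wrapping the existing three lines, which keeps each barrier visibly
adjacent to the RmW operation it modifies.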

Thanx, Paul