Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

From: Paul E. McKenney
Date: Tue May 17 2016 - 15:46:15 EST


On Tue, May 17, 2016 at 12:15:20PM -0700, Peter Hurley wrote:
> On 05/16/2016 10:50 AM, Peter Zijlstra wrote:
> > On Mon, May 16, 2016 at 07:17:42AM -0700, Peter Hurley wrote:
> >
> >>>> Correct; which is why we should always use {READ,WRITE}_ONCE() for
> >>>> anything that is used locklessly.
> >>>
> >>> Completely agreed. Improve readability of code by flagging lockless
> >>> shared-memory accesses, help checkers better find bugs, and prevent the
> >>> occasional compiler mischief!
> >>
> >> I think this would be a mistake for 3 reasons:
> >>
> >> 1. If READ_ONCE()/WRITE_ONCE() is necessary to prevent load/store tearing
> >> of any normally-atomic type (char/int/long/void*), then _every_ access
> >> would require READ_ONCE()/WRITE_ONCE(), thus eliminating any possibility
> >> of compiler optimization (eg. eliding redundant loads) where it would
> >> otherwise be possible.
> >
> > Should not really be a problem I think; you already have to be very
> > careful when doing lockless stuff.
>
> I think you may not understand my point here.
>
> Consider normal list processing.
>
> Paul added a READ_ONCE() load of head->next to list_empty().
> That's because list_empty() is being used locklessly in lots of places,
> not just for RCU lists.
>
> Ok, so the load won't be torn. But the store to ->next can, so Paul
> added WRITE_ONCE() to some but not all those list primitives too, even
> though those aren't being used locklessly.
>
> Note that the compiler _cannot_ optimize those stores even though
> they're being used with locks held; IOW, exactly the situation that
> volatile-consider-harmful rails against.
>
> Similarly for the load in list_empty() even if it's used with locks held.
>
> As Paul pointed out in his reply, there is no way to address this right now.

Actually, if you show a case where this makes a visible system-wide
difference, you could create a set of primitives for #1 below. Have
a compiler version check, and if it is an old compiler, map them to
READ_ONCE() and WRITE_ONCE(), otherwise as follows, though preferably
with better names:

#define READ_NOTEAR(x) __atomic_load_n(&(x), __ATOMIC_RELAXED)
#define WRITE_NOTEAR(x, v) __atomic_store_n(&(x), (v), __ATOMIC_RELAXED)

The ambiguity between "no tear" and "not ear" should help motivate a
better choice of name.

My guess is that the best justification for adding these will come
from the TINY effort. My addition of ATOMIC_READ() and ATOMIC_WRITE()
did increase the size. But perhaps there is a benchmark that can see
the difference in performance, response time, or what have you.

The theoreticians won't like it much, as this will introduce the
theoretical possibility of "out of thin air" values, but then again,
there are other theoreticians working on this problem.

> What's really needed is separate, not overloaded semantics:
>
> 1. "*If* you load or store this value, do it with single memory access, but
> feel free to optimize (elide load/defer store/hoist load/whatever)."
>
> 2. "No, seriously, load/store this value regardless of what you think you
> know." == READ_ONCE/WRITE_ONCE
>
>
> If we start adding READ_ONCE/WRITE_ONCE to every lockless load/store,
> worse code will be generated even though the vast majority of current use
> is safe.

There will indeed be some degradation in theory. I would be surprised
if it is visible at the system level, aside from tiny kernels. That said,
I have been surprised before.

> >> 2. Makes a mess of otherwise readable code.
> >
> > We have to disagree here; I think code with {READ,WRITE}_ONCE() is more
> > readable, as their presence is a clear indication something special is
> > up.
>
> I think you and Paul may wildly underestimate the scale of lockless
> use in the kernel not currently annotated by READ_ONCE()/WRITE_ONCE().

Believe me, I know that READ_ONCE() and WRITE_ONCE() usage would need
to increase by -at- -least- an order of magnitude in order to cover
the cases. Many of the might be harmless with current compilers, but
compilers have had a habit of getting more aggressive over time.

> Even in primitives designed for lockless concurrency like
> rwsem and seqlock and sched/core, much less higher order code like
> ipc/msg.c and vfs.
>
> The majority of this lockless use is safe if the assumption
> that loads/stores are performed atomically for char/short/int/long/void*
> hold; in other words usage #1 above.
>
>
> >> 3. Error-prone; ie., easy to overlook in review.
> >
> > lockless stuff is error prone by nature; what's your point?
>
> That lockless use is very common, even outside core functionality, and
> needlessly splattering READ_ONCE/WRITE_ONCE when the existing code
> generation is already safe, will not make it better. And that it will be
> no safer by adding READ_ONCE/WRITE_ONCE because plenty of accesses will
> be overlooked, like the stores to sem->owner are now.

I expect mechanical aids will help locate the overlooked accesses.

> >> There is no practical difference between _always_ using READ_ONCE()/WRITE_ONCE()
> >> (to prevent tearing) and declaring the field volatile.
> >
> > There is, declaring the field volatile doesn't make it stand out in the
> > code while reading at all; it also doesn't allow you to omit the
> > volatile qualifier in places where it doesn't matter.
> >
> > The whole; variables aren't volatile, accesses are, thing is still very
> > much in effect.
>
> I'm not really seriously arguing for volatile declarations; I'm pointing
> out that READ_ONCE()/WRITE_ONCE() has overloaded meaning that is
> counter-productive.
>
> For example, you point out that it doesn't allow you to omit the
> volatile qualifier but yet we're adding it to underlying primitives
> that may or may not be used locklessly.
>
> Guaranteed list_empty() and list_add() are used way more than INIT_LIST_HEAD().

No argument on usage frequency, just a question on visible effects.

> >> So we've come full-circle from volatile-considered-harmful.
> >
> > I don't think so; the cases that document talks about are still very
> > much relevant and volatile should not be used for that.
>
> The example below is clearly wrong now.
>
> "Another situation where one might be tempted to use volatile is
> when the processor is busy-waiting on the value of a variable. The right
> way to perform a busy wait is:
>
> while (my_variable != what_i_want)
> cpu_relax();
>
> The cpu_relax() call can lower CPU power consumption or yield to a
> hyperthreaded twin processor; it also happens to serve as a compiler
> barrier, so, once again, volatile is unnecessary. Of course, busy-
> waiting is generally an anti-social act to begin with."
>
>
> The fact that cpu_relax() is a barrier is immaterial; the load of my_variable
> must now be READ_ONCE() to prevent load-tearing.
>
> Likewise, whatever is writing to 'my_variable' must now be WRITE_ONCE().

Yow! That document has not been changed since 2010. I bet that there
is much else in it that needs a bit of help. ;-)

Thanx, Paul