Re: [PATCH] futex: replace bare barrier() with a READ_ONCE()
From: Darren Hart
Date: Wed Mar 02 2016 - 01:39:23 EST
On Wed, Mar 02, 2016 at 10:31:18AM +0800, Jianyu Zhan wrote:
> On Wed, Mar 2, 2016 at 9:37 AM, Darren Hart <dvhart@xxxxxxxxxxxxx> wrote:
> > read_once will use a *volatile assignment instead of calling barrier()
> > directly for a word size argument.
> >
> > With weak statements like "apparently" (above) and "could be" (from the original
> > post: This patch replaces this bare barrier() with a READ_ONCE(), a weaker form
> > of barrier(), which could be more informative.)" I do not see a compelling
> > reason to change what is notoriously fragile code with respect to barriers.
> >
>
>
> Fair enough. The "apparently" wording is lame without assembly code evidence.
> I do not have such a s390 machine, so I have cc''ed the original
> author incorporating this barrier(),
> hopefully he could help test this.
>
> By "could be more informative" I mean a READ_ONCE has explanatory
> effect by its name, instead of
> a bare barrier() without more inline comment for why.
>
> As I said the retry code logic is effectively the same as
>
> while (lock_ptr = READ_ONCE(q->lock_ptr)) {
> spin_lock(lock_ptr)
> if (unlikely(lock_ptr != q->lock_ptr)) {
> spin_unlock(lock_ptr);
> }
>
> ...
> }
>
> in which case READ_ONCE could perfectly fit, in that it will make
> compiler only read it once in every testing condition,
> which will eliminate the original problem that commit e91467ecd1ef
> addressed, though assembly code proof is needed.
>
> Actually, the quotation I argued in previous mail could be used again
> here, from memory-barriers.txt:
>
> (*) The compiler is within its rights to merge successive loads from
> the same variable. Such merging can cause the compiler to "optimize"
> the following code:
>
> while (tmp = a)
> do_something_with(tmp);
>
> into the following code, which, although in some sense legitimate
> for single-threaded code, is almost certainly not what the developer
> intended:
>
> if (tmp = a)
> for (;;)
> do_something_with(tmp);
>
> Use READ_ONCE() to prevent the compiler from doing this to you:
>
> while (tmp = READ_ONCE(a))
> do_something_with(tmp);
>
>
> I might be wrong, so I have cc'ed Paul, and Peter, I wish they give comment :-)
I believe you are correct with respect to the retry and while condition being an
appropriate place for the application of READ_ONCE. The question is why is this
preferred to the existing barrier()? I suggest:
While barrier() is a fairly brute force general application of a compiler
barrier, READ_ONCE() is very specific, it targets only operations dealing with
the specified variable. As such, its application both clearly identifies the
volatile variable and frees the compiler to make optimizations a more general
barrier would forbid.
>
>
> > As for #2...
> >
> >> 2. For the second problem I memtioned, yes, it is theoretical, and
> >> it is also due to q->lock_ptr can change between
> >> the test of nullness of q->lock_ptr and the lock on q->lock_ptr.
> >>
> >> the code is
> >>
> >> retry:
> >> lock_ptr = q->lock_ptr;
> >> if (lock_ptr != 0) {
> >> spin_lock(lock_ptr)
> >> if (unlikely(lock_ptr != q->lock_ptr)) {
> >> spin_unlock(lock_ptr);
> >> goto retry;
> >> }
> >> ...
> >> }
> >>
> >> which is effectively the same as :
> >>
> >> while (lock_ptr = q->lock_ptr) {
> >> spin_lock(lock_ptr)
> >> if (unlikely(lock_ptr != q->lock_ptr)) {
> >> spin_unlock(lock_ptr);
> >> goto retry;
> >> }
> >> ...
> >> }
> >>
> >> This might cause the compiler load the q->lock_ptr once and use it
> >> many times, quoted from
Which is already covered by the barrier() in place today as a more general
compiler barrier.
Your argument is then simply that READ_ONCE is a more specific solution to the
problem.
> >> memory-barriers.txt:
> >>
> >>
> >> (*) The compiler is within its rights to reload a variable, for example,
> >> in cases where high register pressure prevents the compiler from
> >> keeping all data of interest in registers. The compiler might
> >> therefore optimize the variable 'tmp' out of our previous example:
> >>
> >> while (tmp = a)
> >> do_something_with(tmp);
> >>
> >> This could result in the following code, which is perfectly safe in
> >> single-threaded code, but can be fatal in concurrent code:
> >>
> >> while (a)
> >> do_something_with(a);
> >>
> >> For example, the optimized version of this code could result in
> >> passing a zero to do_something_with() in the case where the variable
> >> a was modified by some other CPU between the "while" statement and
> >> the call to do_something_with().
> >>
> >> Again, use READ_ONCE() to prevent the compiler from doing this:
> >>
> >> while (tmp = READ_ONCE(a))
> >> do_something_with(tmp);
> >>
> >
> > OK, so this is really the meat of the argument for the patch. You are looking to
> > add a compiler barrier instead of a CPU memory barrier. This should be what your
I was incorrect in my statement here. barrier() is a general compiler barrier and
READ_ONCE is a targeted compiler barrier (impacting only the operations dealing
with the specified variable). This helps the argument considerably.
> > commit message is focused on and it should provide compelling evidence to
> > justify risking the change.
> >
> > Compelling evidence for a compiler barrier would be a disassembly of the code
> > block before and after, demonstrating the compiler generating broken code and the
> > compiler generating correct code.
> >
> > In addition to this, however, I would want to see a strong convincing argument
> > that the READ_ONCE volatile cast is sufficient to cover the original case which
> > motivated the addition of the barrier() (not "apparently" and "could be").
>
> As for #2, this is pure theoretical induction from this quotation, I
> do have no convincing argument
> and again I would like Paul to help clarify this.
>
>
> Thanks very much!
>
>
> Regards,
> Jianyu Zhan
>
--
Darren Hart
Intel Open Source Technology Center