Re: [PATCH] futex: replace bare barrier() with more lightweight READ_ONCE()

From: Darren Hart
Date: Fri Mar 04 2016 - 17:54:10 EST


On Fri, Mar 04, 2016 at 02:45:11PM -0800, Paul McKenney wrote:
> On Fri, Mar 04, 2016 at 02:38:01PM -0800, Darren Hart wrote:
> > On Fri, Mar 04, 2016 at 01:57:20PM -0800, Paul McKenney wrote:
> > > On Fri, Mar 04, 2016 at 01:05:24PM -0800, Darren Hart wrote:
> > > > On Fri, Mar 04, 2016 at 09:12:31AM +0800, Jianyu Zhan wrote:
> > > > > On Fri, Mar 4, 2016 at 1:05 AM, Darren Hart <dvhart@xxxxxxxxxxxxx> wrote:
> > > > > > I thought I provided a corrected comment block.... maybe I didn't. We have been
> > > > > > working on improving the futex documentation, so we're paying close attention to
> > > > > > terminology as well as grammar. This one needs a couple minor tweaks. I suggest:
> > > > > >
> > > > > > /*
> > > > > > * Use READ_ONCE to forbid the compiler from reloading q->lock_ptr and
> > > > > > * optimizing lock_ptr out of the logic below.
> > > > > > */
> > > > > >
> > > > > > The bit about q->lock_ptr possibly changing is already covered by the large
> > > > > > comment block below the spin_lock(lock_ptr) call.
> > > > >
> > > > > > The large comment block explains why the retry logic is required.
> > > > > > To meet this semantic requirement, READ_ONCE is needed to prevent the
> > > > > > compiler from turning the read into two loads.
> > > > >
> > > > > So I think the comment above should explain this tricky part.
> > > >
> > > > Fair point. Consider:
> > > >
> > > >
> > > > /*
> > > > * q->lock_ptr can change between this read and the following spin_lock.
> > > > * Use READ_ONCE to forbid the compiler from reloading q->lock_ptr and
> > > > * optimizing lock_ptr out of the logic below.
> > > > */
> > > >
> > > > >
> > > > > > /* Use READ_ONCE to forbid the compiler from reloading q->lock_ptr in spin_lock() */
> > > > >
> > > > > > As for preventing the compiler from optimizing lock_ptr out of the
> > > > > > retry code block, I have consulted Paul McKenney, and he suggests
> > > > > > that one more READ_ONCE should be added here:
> > > >
> > > > Let's keep this discussion together so we have a record of the
> > > > justification.
> > > >
> > > > +Paul McKenney
> > > >
> > > > Paul, my understanding was that spin_lock was a CPU memory barrier,
> > > > which in turn is an implicit compiler barrier (aka barrier()), of which
> > > > READ_ONCE is described as a weaker form. Reviewing this, I realize the
> > > > scope of barrier() wasn't clear to me. It seems while barrier() ensures
> > > > ordering, it does not offer the same guarantee regarding reloading that
> > > > READ_ONCE offers. So READ_ONCE is not strictly a weaker form of
> > > > barrier() as I had gathered from a spotty reading of
> > > > memory-barriers.txt, but it also offers guarantees regarding memory
> > > > references that barrier() does not.
> > > >
> > > > Correct?
> > >
> > > If q->lock_ptr is never changed except under that lock, then there is
> > > indeed no reason for the ACCESS_ONCE().
> >
> > The only location where a q->lock_ptr is updated without that lock being held is
> > in queue_lock(). This is safe as the futex_q is not yet queued onto an hb until
> > after the lock is held (so unqueue_me() cannot race with queue_lock()).
> >
> > > So, is q->lock_ptr ever changed while the lock is -not- held? If so,
> > > I suggest that you put an ACCESS_ONCE() there.
> >
> > It is not.
>
> If I followed that correctly, then I agree that you don't need an
> ACCESS_ONCE() in this case.
>

Thanks for offering your time and expertise, Paul. It's a major effort every time
I open memory-barriers.txt :-)
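
For the record, here is a rough user-space sketch of the two ways of taking the
lock_ptr snapshot that this thread compares, plus the retry shape around it.
It is not the code in futex.c: toy_lock, toy_q, toy_unqueue() and the two
snapshot helpers are hypothetical names made up for illustration, the
READ_ONCE() and barrier() definitions are simplified stand-ins that assume GCC
or Clang, and clearing lock_ptr inside toy_unqueue() is a toy simplification so
that a second call has something to observe.

```c
/* User-space sketch only; assumes GCC or Clang (__typeof__, inline asm). */
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (assumed definitions). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
#define barrier()    __asm__ __volatile__("" ::: "memory")

/* Hypothetical toy lock: a flag is enough for a single-threaded demo. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Hypothetical stand-in for struct futex_q: only the field discussed here. */
struct toy_q { struct toy_lock *lock_ptr; };

/* Older form: plain load followed by a compiler barrier. */
static struct toy_lock *snapshot_with_barrier(struct toy_q *q)
{
	struct toy_lock *p = q->lock_ptr;

	barrier();	/* compiler may not move memory accesses across this */
	return p;
}

/* Proposed form: a single volatile load of q->lock_ptr. */
static struct toy_lock *snapshot_with_read_once(struct toy_q *q)
{
	return READ_ONCE(q->lock_ptr);
}

/*
 * Retry shape: snapshot q->lock_ptr once, lock through the snapshot,
 * then recheck under the lock. Returns 1 if this call unqueued q,
 * 0 if q was already unqueued (lock_ptr == NULL).
 */
static int toy_unqueue(struct toy_q *q)
{
	struct toy_lock *lock_ptr;

retry:
	lock_ptr = snapshot_with_read_once(q);
	if (lock_ptr == NULL)
		return 0;

	toy_lock_acquire(lock_ptr);
	if (lock_ptr != q->lock_ptr) {	/* q moved to another lock: retry */
		toy_lock_release(lock_ptr);
		goto retry;
	}
	q->lock_ptr = NULL;		/* toy simplification, done under the lock */
	toy_lock_release(lock_ptr);
	return 1;
}
```

Calling toy_unqueue() on a toy_q whose lock_ptr points at an unheld toy_lock
should return 1 and leave the lock released; a second call on the same toy_q
should return 0. Both snapshot helpers return the same pointer here; they
differ only in what they constrain the compiler to do.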

--
Darren Hart
Intel Open Source Technology Center