Re: [PATCH v3 3/5] MCS Lock: Barrier corrections

From: Tim Chen
Date: Thu Nov 07 2013 - 16:16:16 EST

On Thu, 2013-11-07 at 11:59 -0800, Michel Lespinasse wrote:
> On Thu, Nov 7, 2013 at 6:31 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, Nov 07, 2013 at 04:50:23AM -0800, Michel Lespinasse wrote:
> >> On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
> >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >> >
> >> > On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@xxxxxxxxxx> wrote:
> >> >>
> >> >> Rather than writing arch-specific locking code, would you agree to
> >> >> introduce acquire and release memory operations ?
> >> >
> >> > Yes, that's probably the right thing to do. What ops do we need? Store with
> >> > release, cmpxchg and load with acquire? Anything else?
> >>
> >> Depends on what lock types we want to implement on top; for MCS we would need:
> >> - xchg acquire (common case) and load acquire (for spinning on our
> >> locker's wait word)
> >> - cmpxchg release (when there is no next locker) and store release
> >> (when writing to the next locker's wait word)
> >>
> >> One downside of the proposal is that using a load acquire for spinning
> >> puts the memory barrier within the spin loop. So this model is very
> >> intuitive and does not add unnecessary barriers on x86, but it may
> >> place the barriers in a suboptimal place for architectures that need
> >> them.
> >
> > OK, I will bite... Why is a barrier in the spinloop suboptimal?
> It's probably not a big deal - all I meant to say is that if you were
> manually placing barriers, you would probably put one after the loop
> instead. I don't deal much with architectures where such barriers are
> needed, so I don't know for sure if the difference means much.

If the cost of a barrier inside the spin loop is a concern, the lock
function could spin with plain loads and do the load acquire only once,
at the end of the spin loop.
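
To make the barrier placement concrete, here is a rough userspace sketch
using C11 atomics. It is not the patchset's code: the names mcs_node,
mcs_lock, mcs_lock_acquire and mcs_lock_release are made up for
illustration, and C11 memory orders stand in for the proposed kernel
acquire/release operations. The spin loop uses relaxed loads followed by
a single acquire fence after the loop; the commented-out line shows the
load-acquire-in-the-loop variant. One assumption to flag: the exchange on
the tail is written as acq_rel rather than acquire only, because in
portable C11 the node initialization must be ordered before a later
locker writes to node->next.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the kernel's MCS lock API. */
struct mcs_node {
    _Atomic(struct mcs_node *) next; /* successor in the queue, if any */
    atomic_bool locked;              /* set to true when the lock is handed to us */
};

struct mcs_lock {
    _Atomic(struct mcs_node *) tail; /* last waiter, or NULL if the lock is free */
};

static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *prev;

    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&node->locked, false, memory_order_relaxed);

    /*
     * Common case: swap ourselves in as the tail. Acquire pairs with the
     * previous holder's release; the release half (an assumption of this
     * sketch) orders the node initialization above before any later
     * locker that finds this node through the tail.
     */
    prev = atomic_exchange_explicit(&lock->tail, node, memory_order_acq_rel);
    if (prev == NULL)
        return; /* lock was free, we own it */

    /* Link behind the previous waiter so it can find us on unlock. */
    atomic_store_explicit(&prev->next, node, memory_order_release);

    /*
     * Variant with the barrier inside the loop:
     *   while (!atomic_load_explicit(&node->locked, memory_order_acquire)) ;
     *
     * Variant used here: plain (relaxed) loads while spinning, then one
     * acquire fence after the loop exits.
     */
    while (!atomic_load_explicit(&node->locked, memory_order_relaxed))
        ;
    atomic_thread_fence(memory_order_acquire);
}

static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_acquire);

    if (next == NULL) {
        struct mcs_node *expected = node;

        /* No next locker visible: cmpxchg release to mark the lock free. */
        if (atomic_compare_exchange_strong_explicit(&lock->tail, &expected,
                                                    NULL,
                                                    memory_order_release,
                                                    memory_order_relaxed))
            return;

        /* A new waiter swapped the tail; wait until it links itself in. */
        while ((next = atomic_load_explicit(&node->next,
                                            memory_order_acquire)) == NULL)
            ;
    }

    /* Store release to the next locker's wait word hands the lock over. */
    atomic_store_explicit(&next->locked, true, memory_order_release);
}
```

On x86 both variants should compile to the same plain loads; the
difference would only show up on architectures where an acquire load or
fence costs an actual barrier instruction.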

Michel, are you planning to implement the load-acquire/store-release
functions for the various architectures?

Or is an arch-specific memory barrier for MCS an acceptable approach
until load-acquire and store-release are available? Are there any
technical issues remaining with the patchset once Waiman's
arch-specific barrier is included?


> > Can't say that I have tried measuring it, but the barrier should not
> > normally result in interconnect traffic. Given that the barrier is
> > required anyway, it should not affect lock-acquisition latency.
> Agree
> > So what am I missing here?
> I think you read my second email as me trying to shoot down a proposal
> - I wasn't, as I really like the acquire/release model and find it
> easy to program with, which is why I'm proposing it in the first
> place. I just wanted to be upfront about all potential downsides, so
> we can consider them and see if they are significant - I don't think
> they are, but I'm not the best person to judge that as I mostly just
> deal with x86 stuff.
