Re: [PATCH v3 3/5] MCS Lock: Barrier corrections

From: Michel Lespinasse
Date: Thu Nov 07 2013 - 15:00:02 EST

Next message: Ard Biesheuvel: "Re: [RFC PATCH 2/4] cpu: advertise CPU features over udev in a generic way"
Previous message: Tim Kryger: "Re: [RESEND PATCH 2/4] i2c: i2c-bcm-kona: Add support for high-speed mode"
In reply to: Paul E. McKenney: "Re: [PATCH v3 3/5] MCS Lock: Barrier corrections"
Next in thread: Tim Chen: "Re: [PATCH v3 3/5] MCS Lock: Barrier corrections"

On Thu, Nov 7, 2013 at 6:31 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Nov 07, 2013 at 04:50:23AM -0800, Michel Lespinasse wrote:
>> On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> >
>> > On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@xxxxxxxxxx> wrote:
>> >>
>> >> Rather than writing arch-specific locking code, would you agree to
>> >> introduce acquire and release memory operations ?
>> >
>> > Yes, that's probably the right thing to do. What ops do we need? Store with
>> > release, cmpxchg and load with acquire? Anything else?
>>
>> Depends on what lock types we want to implement on top; for MCS we would need:
>> - xchg acquire (common case) and load acquire (for spinning on our
>> locker's wait word)
>> - cmpxchg release (when there is no next locker) and store release
>> (when writing to the next locker's wait word)
>>
>> One downside of the proposal is that using a load acquire for spinning
>> puts the memory barrier within the spin loop. So this model is very
>> intuitive and does not add unnecessary barriers on x86, but it may
>> place the barriers in a suboptimal place for architectures that need
>> them.
>
> OK, I will bite... Why is a barrier in the spinloop suboptimal?

It's probably not a big deal - all I meant to say is that if you were
manually placing barriers, you would probably put one after the loop
instead. I don't deal much with architectures where such barriers are
needed, so I don't know for sure if the difference means much.
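
To make the two placements concrete, here is a rough userspace sketch of an
MCS lock written with C11 atomics rather than kernel primitives. All names
(mcs_node, mcs_acquire, wait_acquire_in_loop, ...) are made up for
illustration and are not taken from the patch series. Two assumptions to
flag: the xchg is acq_rel rather than the plain acquire listed above, because
the C11 model needs a release to publish the node's initialization, and the
->next handoff uses a release/acquire pair for the same reason.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-locker queue node; "locked" is the wait word. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;               /* true while this locker must wait */
};

struct mcs_lock {
    _Atomic(struct mcs_node *) tail;  /* NULL when the lock is free */
};

/* Placement A: load-acquire, so the barrier sits inside the spin loop. */
static void wait_acquire_in_loop(struct mcs_node *node)
{
    while (atomic_load_explicit(&node->locked, memory_order_acquire))
        ;
}

/* Placement B: relaxed loads, then a single acquire fence after the loop. */
static void wait_fence_after_loop(struct mcs_node *node)
{
    while (atomic_load_explicit(&node->locked, memory_order_relaxed))
        ;
    atomic_thread_fence(memory_order_acquire);
}

static void mcs_acquire(struct mcs_lock *lock, struct mcs_node *node,
                        void (*spin)(struct mcs_node *))
{
    struct mcs_node *prev;

    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&node->locked, true, memory_order_relaxed);

    /* Assumption: acq_rel so the stores above are visible to other lockers. */
    prev = atomic_exchange_explicit(&lock->tail, node, memory_order_acq_rel);
    if (prev == NULL)
        return;                       /* common case: lock was free */

    /* Link behind the previous locker, then spin on our own wait word. */
    atomic_store_explicit(&prev->next, node, memory_order_release);
    spin(node);
}

static void mcs_release(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_acquire);

    if (next == NULL) {
        struct mcs_node *expected = node;

        /* cmpxchg release: no next locker, so mark the lock free. */
        if (atomic_compare_exchange_strong_explicit(&lock->tail, &expected,
                NULL, memory_order_release, memory_order_relaxed))
            return;

        /* A new locker swapped itself in; wait until it has linked. */
        while ((next = atomic_load_explicit(&node->next,
                                            memory_order_acquire)) == NULL)
            ;
    }

    /* Store release to the next locker's wait word hands over the lock. */
    atomic_store_explicit(&next->locked, false, memory_order_release);
}
```

Callers would pass either wait function as the last argument to mcs_acquire().
On x86 both variants should compile to the same plain loads; the difference
only shows up on architectures where an acquire needs a barrier instruction.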

> Can't say that I have tried measuring it, but the barrier should not
> normally result in interconnect traffic. Given that the barrier is
> required anyway, it should not affect lock-acquisition latency.

Agreed.

> So what am I missing here?

I think you read my second email as me trying to shoot down a proposal
- I wasn't, as I really like the acquire/release model and find it
easy to program with, which is why I'm proposing it in the first
place. I just wanted to be upfront about all potential downsides, so
we can consider them and see if they are significant - I don't think
they are, but I'm not the best person to judge that as I mostly just
deal with x86 stuff.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.