Re: [RFC][PATCH] mips: Fix arch_spin_unlock()

From: Måns Rullgård
Date: Thu Nov 12 2015 - 13:13:53 EST

David Daney <ddaney@xxxxxxxxxxxxxxxxxx> writes:

> On 11/12/2015 04:31 AM, Peter Zijlstra wrote:
>> Hi
>> I think the MIPS arch_spin_unlock() is borken.
>> spin_unlock() must have RELEASE semantics, these require that no LOADs
>> nor STOREs leak out from the critical section.
>> From what I know MIPS has a relaxed memory model which allows reads to
>> pass stores, and as implemented arch_spin_unlock() only issues a wmb
>> which doesn't order prior reads vs later stores.
>> Therefore upgrade the wmb() to smp_mb().
>> (Also, why the unconditional wmb, as opposed to smp_wmb() ?)
> asm/spinlock.h is only used for CONFIG_SMP. So, smp_wmb() would
> imply that special handling for non-SMP is needed, when this is
> already only used for the SMP build case.
>> Maybe-Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>> ---
>> diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h
>> index 40196bebe849..b2ca13f06152 100644
>> --- a/arch/mips/include/asm/spinlock.h
>> +++ b/arch/mips/include/asm/spinlock.h
>> @@ -140,7 +140,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
>>  static inline void arch_spin_unlock(arch_spinlock_t *lock)
>>  {
>>  	unsigned int serving_now = lock->h.serving_now + 1;
>> -	wmb();
>> +	smp_mb();
> That is too heavy.
> It implies a full MIPS "SYNC" operation which stalls execution until
> all previous writes are committed and globally visible.
> We really want just release semantics, and there is no standard named
> primitive that gives us that.
> For CONFIG_CPU_CAVIUM_OCTEON the proper thing would be:
> smp_wmb();
> smp_rmb();
> Which expands to exactly the same thing as wmb() because smp_rmb()
> expands to nothing.
> For CPUs that have out-of-order loads, smp_rmb() should expand to
> something lighter weight than "SYNC".
> Certainly we can load up the code with "SYNC" all over the place, but
> it will kill performance on SMP systems. So, my vote would be to make
> it as light weight as possible, but no lighter. That will mean
> inventing the proper barrier primitives.

It seems to me that the proper barrier here is a "SYNC 18" aka
SYNC_RELEASE instruction, at least on CPUs that implement that variant.
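
To illustrate what "just release semantics" means for the unlock path,
here is a minimal, portable sketch using C11 atomics rather than kernel
code. The struct and function names (ticket_lock, ticket_lock_init,
ticket_lock_acquire, ticket_lock_release) are hypothetical and only
loosely modeled on the serving_now field in the patch. The release store
stands in for "lightweight release barrier, then store"; how a compiler
lowers it on MIPS depends on the toolchain and target and is not shown.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical ticket lock for illustration; NOT the kernel's
 * arch_spinlock_t. */
struct ticket_lock {
	atomic_uint next_ticket;	/* next ticket to hand out */
	atomic_uint serving_now;	/* ticket currently allowed in */
};

static void ticket_lock_init(struct ticket_lock *lock)
{
	atomic_init(&lock->next_ticket, 0);
	atomic_init(&lock->serving_now, 0);
}

static void ticket_lock_acquire(struct ticket_lock *lock)
{
	/* Take a ticket; ordering comes from the acquire load below. */
	unsigned int me = atomic_fetch_add_explicit(&lock->next_ticket, 1,
						    memory_order_relaxed);

	/* Spin until our ticket is served. The acquire load keeps the
	 * critical section from being reordered before this point. */
	while (atomic_load_explicit(&lock->serving_now,
				    memory_order_acquire) != me)
		;
}

static void ticket_lock_release(struct ticket_lock *lock)
{
	/* Only the lock holder writes serving_now, so a relaxed load is
	 * enough to read the current value. */
	unsigned int cur = atomic_load_explicit(&lock->serving_now,
						memory_order_relaxed);

	/* Sanity check: the lock must actually be held. */
	assert(cur != atomic_load_explicit(&lock->next_ticket,
					   memory_order_relaxed));

	/* Release store: no earlier load or store in the critical
	 * section may be reordered after it. This is the property that
	 * a write-only barrier does not provide for prior loads. */
	atomic_store_explicit(&lock->serving_now, cur + 1,
			      memory_order_release);
}
```

The point of the sketch is that the unlock needs ordering in one
direction only (earlier accesses before the store), which is weaker than
a full barrier.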

Måns Rullgård
