Re: [PATCH 1/6] MIPS: Octeon: Implement __smp_store_release()

From: Peter Zijlstra
Date: Wed Jan 27 2021 - 17:59:09 EST


On Wed, Jan 27, 2021 at 09:36:22PM +0100, Alexander A Sverdlin wrote:
> From: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxx>
>
> On Octeon smp_mb() translates to SYNC, while wmb+rmb translates to SYNCW
> only. This brings around a 10% performance improvement on tight,
> uncontended spinlock loops.
>
> Refer to commit 500c2e1fdbcc ("MIPS: Optimize spinlocks.") and the link
> below.
>
> On 6-core Octeon machine:
> sysbench --test=mutex --num-threads=64 --memory-scope=local run
>
> w/o patch: 1.60s
> with patch: 1.51s
>
> Link: https://lore.kernel.org/lkml/5644D08D.4080206@xxxxxxxxxxxxxxxxxx/
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxx>
> ---
> arch/mips/include/asm/barrier.h | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
> index 49ff172..24c3f2c 100644
> --- a/arch/mips/include/asm/barrier.h
> +++ b/arch/mips/include/asm/barrier.h
> @@ -113,6 +113,15 @@ static inline void wmb(void)
> ".set arch=octeon\n\t" \
> "syncw\n\t" \
> ".set pop" : : : "memory")
> +
> +#define __smp_store_release(p, v)					\
> +do {									\
> +	compiletime_assert_atomic_type(*p);				\
> +	__smp_wmb();							\
> +	__smp_rmb();							\
> +	WRITE_ONCE(*p, v);						\
> +} while (0)

This is wrong in general, since smp_rmb() only provides ordering between
two loads, and smp_store_release() is a store.
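
As a rough illustration (hypothetical names, not from this patch): a
release store must also keep earlier loads from moving past it, e.g. a
load performed under a lock has to complete before the unlocking store
can become visible:

int data;		/* protected by 'locked' */
int locked;		/* 1 = held, 0 = free */

int read_and_unlock(void)
{
	int val = READ_ONCE(data);	/* load inside the critical section */

	/*
	 * Release semantics: this store must not be observed before the
	 * load above has completed, i.e. we need load->store ordering,
	 * which a write-only barrier does not obviously provide.
	 */
	smp_store_release(&locked, 0);
	return val;
}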

If this is correct for all MIPS, this needs a giant comment on exactly
how that smp_rmb() makes sense here.
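
For comparison, the generic fallback (paraphrased from
include/asm-generic/barrier.h) uses a full barrier, which orders both
prior loads and prior stores against the release store:

#define __smp_store_release(p, v)					\
do {									\
	compiletime_assert_atomic_type(*p);				\
	__smp_mb();							\
	WRITE_ONCE(*p, v);						\
} while (0)

Replacing that full barrier with a weaker sequence needs an argument for
why it is still a valid release on the hardware in question.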