Re: Re: [PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()

From: Greg Kroah-Hartman
Date: Wed Mar 13 2019 - 16:58:08 EST


On Wed, Mar 13, 2019 at 09:17:15PM +0800, Huacai Chen wrote:
> Hi, GREG,
>
> The 4.9 backport needs to modify spinlock.h; please wait for my patch.
>
>
>
> ---Original Message---
> From: "Greg Kroah-Hartman"<gregkh@xxxxxxxxxxxxxxxxxxx>
> Sent: March 13, 2019 (Wednesday) 1:10 AM
> To: "linux-kernel"<linux-kernel@xxxxxxxxxxxxxxx>;
> Subject: [PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()
> 4.9-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> [ Upstream commit e02e07e3127d8aec1f4bcdfb2fc52a2d99b4859e ]
>
> On the Loongson-2G/2H/3A/3B there is a hardware flaw: ll/sc and lld/scd
> are very weakly ordered. We should add sync instructions "before each
> ll/lld" and "at the branch-target between ll/sc" to work around it.
> Otherwise, this flaw will occasionally cause deadlocks (e.g. when doing
> heavy load tests with LTP).
>
> Below is the explanation from the CPU designer:
>
> "For Loongson 3 family, when a memory access instruction (load, store,
> or prefetch) executes between the execution of LL and SC, the
> success or failure of SC is not predictable. Although the programmer would
> not insert memory access instructions between LL and SC, the memory
> instructions before LL in program order may be dynamically executed
> between the execution of LL/SC, so a memory fence (SYNC) is needed
> before LL/LLD to avoid this situation.
>
> Since Loongson-3A R2 (3A2000), we have improved our hardware design to
> handle this case. But we later deduced a rare circumstance in which some
> memory instructions speculatively executed due to branch misprediction
> between LL/SC still fall into the above case, so a memory fence (SYNC)
> at the branch-target (if its target is not between LL/SC) is needed for
> Loongson 3A1000, 3B1500, 3A2000 and 3A3000.
>
> Our processor is continually evolving and we aim to remove all these
> workaround-SYNCs around LL/SC for new processors."
>
> Here is an example:
>
> Both cpu1 and cpu2 simultaneously run atomic_add by 1 on the same atomic
> var. This bug causes both 'sc' instructions run by the two cpus (in
> atomic_add) to succeed at the same time ('sc' returns 1), and the variable
> is sometimes only *added by 1*, which is wrong and unacceptable (it should
> be added by 2).
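
A minimal userspace sketch of the lost update described in the quoted example. It is not kernel code and does not model real Loongson hardware: `struct cpu`, `ll()`, `sc()`, `run()` and the `buggy` flag are hypothetical names, and the flaw is mimicked simply by letting an SC succeed even after its link was broken.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: each CPU remembers the value it loaded with LL and whether
 * its link is still valid. All names here are hypothetical. */
struct cpu {
    int loaded;
    bool link_valid;
};

static int var; /* the shared "atomic" variable */

/* LL: load the current value and establish the link. */
static void ll(struct cpu *c)
{
    c->loaded = var;
    c->link_valid = true;
}

/* SC: store `value` and break the other CPU's link. A correct SC fails
 * when its own link was broken; with `buggy` set it succeeds anyway. */
static bool sc(struct cpu *self, struct cpu *other, int value, bool buggy)
{
    if (!self->link_valid && !buggy)
        return false;
    var = value;
    self->link_valid = false;
    other->link_valid = false;
    return true;
}

/* Two CPUs each do atomic_add(1) with the interleaving LL1, LL2, SC1, SC2.
 * Returns the final value of the variable. */
static int run(bool buggy)
{
    struct cpu cpu1 = { 0, false }, cpu2 = { 0, false };
    bool ok;

    var = 0;
    ll(&cpu1);
    ll(&cpu2);
    ok = sc(&cpu1, &cpu2, cpu1.loaded + 1, buggy);
    assert(ok); /* the first SC succeeds in both models */
    (void)ok;

    /* cpu2 retries from LL whenever its SC fails */
    while (!sc(&cpu2, &cpu1, cpu2.loaded + 1, buggy))
        ll(&cpu2);

    return var;
}
```

In this model `run(false)` ends with the variable at 2, while `run(true)` ends at 1 because both SCs report success on the same loaded value.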
>
> Why disable fix-loongson3-llsc in the compiler?
> Because the compiler fix will cause problems in the kernel's __ex_table
> section.
>
> This patch fixes all the cases in the kernel, but:
>
> +. the fix at the end of futex_atomic_cmpxchg_inatomic is for the
> branch-target of 'bne'; there are other cases in which
> smp_mb__before_llsc() and smp_llsc_mb() fix the ll and branch-target
> coincidentally, such as atomic_sub_if_positive/cmpxchg/xchg, just like
> this one.
>
> +. Loongson 3 does not support CONFIG_EDAC_ATOMIC_SCRUB, so no need to
> touch edac.h
>
> +. local_ops and cmpxchg_local should not be affected by this bug since
> only the owner can write.
>
> +. mips_atomic_set in syscall.c is deprecated and rarely used, so it is
> left untouched
>
> Signed-off-by: Huacai Chen <chenhc@xxxxxxxxxx>
> Signed-off-by: Huang Pei <huangpei@xxxxxxxxxxx>
> [paul.burton@xxxxxxxx:
> - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add
> a comment describing why it's there.
> - Make loongson_llsc_mb() a no-op when
> CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory
> barrier.
> - Add a comment describing the bug & how loongson_llsc_mb() helps
> in asm/barrier.h.]
> Signed-off-by: Paul Burton <paul.burton@xxxxxxxx>
> Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
> Cc: ambrosehua@xxxxxxxxx
> Cc: Steven J . Hill <Steven.Hill@xxxxxxxxxx>
> Cc: linux-mips@xxxxxxxxxxxxxx
> Cc: Fuxin Zhang <zhangfx@xxxxxxxxxx>
> Cc: Zhangjin Wu <wuzhangjin@xxxxxxxxx>
> Cc: Li Xuefeng <lixuefeng@xxxxxxxxxxx>
> Cc: Xu Chenghua <xuchenghua@xxxxxxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> ---
>  arch/mips/Kconfig               | 15 ++++++++++++++
>  arch/mips/include/asm/atomic.h  |  6 ++++++
>  arch/mips/include/asm/barrier.h | 36 +++++++++++++++++++++++++++++++++
>  arch/mips/include/asm/bitops.h  |  5 +++++
>  arch/mips/include/asm/futex.h   |  3 +++
>  arch/mips/include/asm/pgtable.h |  2 ++
>  arch/mips/loongson64/Platform   | 23 +++++++++++++++++++++
>  arch/mips/mm/tlbex.c            | 10 +++++++++
>  8 files changed, 100 insertions(+)
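
For readers following the thread, here is a rough sketch of the loongson_llsc_mb() behaviour that the bracketed notes in the quoted commit describe: a SYNC when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled, and a true no-op (not even a compiler barrier) otherwise. This is an illustration under assumptions, not the patch itself: the `__mips__` guard is added only so the sketch builds elsewhere, and `sketch_atomic_add_return()` is a hypothetical stand-in that uses a C11 compare-exchange loop in place of a real ll/sc sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdatomic.h>

/* Assumption: CONFIG_CPU_LOONGSON3_WORKAROUNDS comes from the build
 * configuration; the __mips__ check is only for portability here. */
#if defined(CONFIG_CPU_LOONGSON3_WORKAROUNDS) && defined(__mips__)
/* Workaround enabled: emit SYNC and forbid compiler reordering. */
#define loongson_llsc_mb() __asm__ __volatile__("sync" : : : "memory")
#else
/* Workaround disabled: expands to nothing at all. */
#define loongson_llsc_mb() do { } while (0)
#endif

/* Hypothetical stand-in for an ll/sc retry loop. */
static int sketch_atomic_add_return(atomic_int *v, int i)
{
    int old, val;

    assert(v != NULL);
    loongson_llsc_mb(); /* barrier placed before the "ll" */
    old = atomic_load_explicit(v, memory_order_relaxed);
    do {
        val = old + i; /* on failure, `old` is refreshed and we retry */
    } while (!atomic_compare_exchange_weak_explicit(
                 v, &old, val,
                 memory_order_relaxed, memory_order_relaxed));
    return val;
}
```

The empty `do { } while (0)` form keeps the macro usable as a single statement while generating no code when the workaround is configured out.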

Ok, I will go drop this from all stable queues now, thanks!

greg k-h