Re: Re: [RFC][PATCH 2/5] mips/atomic: Fix loongson_llsc_mb() wreckage

From: Peter Zijlstra
Date: Thu Apr 25 2019 - 03:34:16 EST



A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

On Thu, Apr 25, 2019 at 12:58:50PM +0800, huangpei@xxxxxxxxxxx wrote:
> In my opinion. patch 2/3 is about Loongson's bug, and patch 4/5 is another theme.

Agreed; it's just that looking at the MIPS code to fix 4/5 made me trip
over this stuff.

> Let me explain the bug more specific:
>
> the bug ONLY matters in following situation:
>
> #. more than one cpu (assume cpu A and B) doing ll/sc on same shared
> var V
>
> #. speculative memory access from A cause A erroneously succeed sc
> operation, since the erroneously successful sc operation violate the
> coherence protocol. (here coherence protocol means the rules that CPU
> follow to implement ll/sc right)
>
> #. B succeed sc operation too, but this sc operation is right both
> logically and follow the coherence protocol, and makes A's sc wrong
> logically since only ONE sc operation can succeed.

(I know your coherence protocol is probably more complicated than MESI,
but bear with me)

So A speculatively gets V's line in Exclusive mode, speculates the Lock
flag is still there and completes the Store. This speculative store then
leaks out and violates MESI because there _should_ only be one Exclusive
owner of a line (B).

Something like that?

> If it is not LL/SC but other memory access from B on V, A's ll/sc can
> follow the atomic semantics even if A violate the coherence protocol
> in the same situation.

*shudder*...

C atomic-set

{
atomic_set(v, 1);
}

P1(atomic_t *v)
{
atomic_add_unless(v, 1, 0);
}

P2(atomic_t *v)
{
atomic_set(v, 0);
}

exists
(v=2)

So that one will still work? (that is, v=2 is forbidden)

> In one wordï the bug only affect local cpuâs ll/sc operation, and
> affect MP system.

Because it is a coherence issue, triggered by a reorder. OK.

> PS:
>
> If local_t is only ll/sc manipulated by current CPUï then no need fix it.

It _should_ be CPU local, but this was not at all clear from reading the
original changelog nor the comment with loongson_llsc_mb().