Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()

From: Paul E. McKenney
Date: Wed Nov 04 2015 - 08:04:39 EST


On Tue, Nov 03, 2015 at 08:43:22PM -0800, Linus Torvalds wrote:
> On Tue, Nov 3, 2015 at 7:57 PM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Thank you, and yes, it clearly states that read-to-write dependencies
> > are ordered.
>
> Well, I wouldn't say that it's exactly "clear".
>
> The fact that they explicitly say "Note that the DP relation does not
> directly impose a BEFORE (⇐) ordering between accesses u and v" makes
> it all clear as mud.
>
> They *then* go on to talk about how the DP relationship *together*
> with the odd "is source of" ordering (which in turn is defined in
> terms of BEFORE ordering) cannot have cycles.
>
> I have no idea why they do it that way, but the reason seems to be
> that they wanted to make "BEFORE" be purely about barriers and
> accesses, and make the other orderings be described separately. So the
> "BEFORE" ordering is used to define how memory must act, which is then
> used as a basis for that storage definition and the "is source of"
> thing.
>
> But none of that seems to make much sense to a *user*.
>
> The fact that they seem to equate "BEFORE" with "Processor Issue
> Constraints" also makes me think that the whole logic was written by a
> CPU designer, and part of why they document it that way is that the
> CPU designer literally thought of "can I issue this access" as being
> very different from "is there some inherent ordering that just results
> from issues outside of my design".
>
> I really don't know. That whole series of memory ordering rules makes
> my head hurt.

The honest answer is that for Alpha, I don't know either. But if your
head isn't hurting enough yet, feel free to read on...



My guess, based loosely on ARM and PowerPC, is that memory barriers
provide global ordering but that the load-to-store dependency ordering is
strictly local. Which as you say does not necessarily make much sense
to a user. One guess is that the following would never trigger the
BUG_ON() given x, y, and z initially zero (and ignoring the possibility
of compiler mischief):

        CPU 0           CPU 1           CPU 2
        r1 = x;         r2 = y;         r3 = z;
        if (r1)         if (r2)         if (r3)
            y = 1;          z = 1;          x = 1;
        BUG_ON(r1 == 1 && r2 == 1 && r3 == 1); /* after the dust settles */

The write-to-read relationships prevent misordering. The copy of the
Alpha manual I downloaded hints that this BUG_ON() could not fire.

However, the following might well trigger:

        CPU 0           CPU 1           CPU 2
        x = 1;          r1 = x;         r2 = y;
                        if (r1)         if (r2)
                            y = 1;          x = 2;
        BUG_ON(r1 == 1 && r2 == 1 && x == 1); /* after the dust settles */

The dependency ordering orders each CPU individually, but might not force
CPU 0's write to reach CPU 1 and CPU 2 at the same time. So the BUG_ON()
case would happen if CPU 0's write to x reached CPU 1 before it reached
CPU 2, in which case the x==2 value might not be seen outside of CPU 2,
so that everyone still agrees on the order of values taken on by x
(2, then 1).
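
To make the "not at the same time" point concrete, here is a minimal
userspace sketch (plain C, not kernel code) that runs the second litmus
test above against a single shared memory, trying every interleaving
that respects program order. That is sequential consistency, where
every write reaches all CPUs at once. The names struct state, step(),
explore(), and count_bug_on_hits() are made up for illustration. Under
that model the BUG_ON() condition is never satisfied, so if it can fire
on real hardware, it has to be because of the non-simultaneous
propagation described above. This says nothing about what Alpha
actually does.

```c
#include <stdbool.h>

/* Shared variables and per-CPU registers of the litmus test. */
struct state {
	int x, y;	/* shared memory, initially zero */
	int r1, r2;	/* registers of CPU 1 and CPU 2 */
};

/* Number of operations each CPU executes in program order. */
static const int nr_ops[3] = { 1, 2, 2 };

/* Execute operation number pc of the given CPU against one shared memory. */
static void step(struct state *s, int cpu, int pc)
{
	if (cpu == 0) {
		s->x = 1;			/* CPU 0: x = 1; */
	} else if (cpu == 1) {
		if (pc == 0)
			s->r1 = s->x;		/* CPU 1: r1 = x; */
		else if (s->r1)
			s->y = 1;		/* CPU 1: if (r1) y = 1; */
	} else {
		if (pc == 0)
			s->r2 = s->y;		/* CPU 2: r2 = y; */
		else if (s->r2)
			s->x = 2;		/* CPU 2: if (r2) x = 2; */
	}
}

/*
 * Recursively try every interleaving that respects program order.
 * Counts completed executions in *total and returns how many of them
 * satisfy the BUG_ON() condition.
 */
static int explore(struct state s, int pc[3], int *total)
{
	int hits = 0;
	bool done = true;

	for (int cpu = 0; cpu < 3; cpu++) {
		if (pc[cpu] == nr_ops[cpu])
			continue;	/* this CPU has finished */
		done = false;

		struct state next = s;
		step(&next, cpu, pc[cpu]);
		pc[cpu]++;
		hits += explore(next, pc, total);
		pc[cpu]--;
	}
	if (done) {
		(*total)++;
		hits += (s.r1 == 1 && s.r2 == 1 && s.x == 1);
	}
	return hits;
}

/* Hypothetical entry point: returns the number of BUG_ON() hits. */
int count_bug_on_hits(int *total)
{
	struct state init = { 0, 0, 0, 0 };
	int pc[3] = { 0, 0, 0 };

	*total = 0;
	return explore(init, pc, total);
}
```

Calling count_bug_on_hits(&total) from a main() of your choosing visits
all 30 interleavings and returns zero hits. The reason is visible in the
model: r1 == 1 and r2 == 1 force x = 2 to be the last write to the single
copy of x, so the final value cannot be 1.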

And this could be prevented by enforcing global (rather than local)
ordering by placing a memory barrier in CPU 1:

        CPU 0           CPU 1           CPU 2
        x = 1;          r1 = x;         r2 = y;
                        smp_mb();       if (r2)
                        if (r1)             x = 2;
                            y = 1;
        BUG_ON(r1 == 1 && r2 == 1 && x == 1); /* after the dust settles */

But the reference manual doesn't have this sort of litmus test, so who
knows? CCing the Alpha maintainers in case they know.

> But I do think the only thing that matters in the end is that they do
> have that DP relationship between reads and subsequently dependent
> writes, but basically not for *any* other relationship.
>
> So on alpha read-vs-read, write-vs-later-read, and write-vs-write all
> have to have memory barriers, unless the accesses physically overlap.

Agreed.

Thanx, Paul
