Re: Prototype patch for Linux-kernel memory model
From: afzal mohammed
Date: Tue Dec 19 2017 - 03:37:14 EST
Hi,
A trivial & late (sorry) comment,
On Wed, Nov 15, 2017 at 08:37:49AM -0800, Paul E. McKenney wrote:
> +THE HAPPENS-BEFORE RELATION: hb
> +-------------------------------
> +Less trivial examples of prop all involve fences. Unlike the simple
> +examples above, they can require that some instructions are executed
> +out of program order. This next one should look familiar:
> +
> +	int buf = 0, flag = 0;
> +
> +	P0()
> +	{
> +		WRITE_ONCE(buf, 1);
> +		smp_wmb();
> +		WRITE_ONCE(flag, 1);
> +	}
> +
> +	P1()
> +	{
> +		int r1;
> +		int r2;
> +
> +		r1 = READ_ONCE(flag);
> +		r2 = READ_ONCE(buf);
> +	}
> +
> +This is the MP pattern again, with an smp_wmb() fence between the two
> +stores. If r1 = 1 and r2 = 0 at the end then there is a prop link
> +from P1's second load to its first (backwards!). The reason is
> +similar to the previous examples: The value P1 loads from buf gets
> +overwritten by P1's store to buf,
That should be "P0's store to buf" (P1 only loads from buf).
afzal
> the fence guarantees that the store
> +to buf will propagate to P1 before the store to flag does, and the
> +store to flag propagates to P1 before P1 reads flag.
> +
> +The prop link says that in order to obtain the r1 = 1, r2 = 0 result,
> +P1 must execute its second load before the first. Indeed, if the load
> +from flag were executed first, then the buf = 1 store would already
> +have propagated to P1 by the time P1's load from buf executed, so r2
> +would have been 1 at the end, not 0. (The reasoning holds even for
> +Alpha, although the details are more complicated and we will not go
> +into them.)
> +
> +But what if we put an smp_rmb() fence between P1's loads? The fence
> +would force the two loads to be executed in program order, and it
> +would generate a cycle in the hb relation: The fence would create a ppo
> +link (hence an hb link) from the first load to the second, and the
> +prop relation would give an hb link from the second load to the first.
> +Since an instruction can't execute before itself, we are forced to
> +conclude that if an smp_rmb() fence is added, the r1 = 1, r2 = 0
> +outcome is impossible -- as it should be.