Re: [PATCH] Document Linux's memory barriers [try #4]

From: Linus Torvalds
Date: Tue Mar 14 2006 - 20:45:36 EST

On Wed, 15 Mar 2006, David Howells wrote:

> Linus Torvalds <torvalds@xxxxxxxx> wrote:
>
> > That's not that different from doing
> >
> > ptr = read a
> > data = read [ptr]
> >
> > and speculating the result of the first read.
>
> But that would lead to the situation I suggested (q == &b and d == a), not the
> one Paul suggested (q == &b and d == old b) because we'd speculate on the old
> value of the pointer, and so see it before it's updated, and thus still
> pointing to a.

No. If it _speculates_ the old value, and the value has actually changed
when it checks the speculation, it would generally result in a uarch trap,
and re-do of the instruction without speculation.

So for data speculation to make a difference in this case, it would
speculate the _new_ value (hey, doesn't matter _why_ - it could be that a
previous load at a previous time had gotten that value), and then load the
old value off the new pointer, and when the speculation ends up being
checked, it all pans out (the speculated value matched the value when "a"
was actually later read), and you get a "non-causal" result.
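
The two cases above (guessing the old pointer versus guessing the new
one) can be walked through with a small single-threaded toy model. This is
only a sketch of the reasoning, not how any real CPU works: the variable
names, the values OLD_B and NEW_B, and the function simulate_reader() are
all made up for illustration, and the "writer" is simply run inline between
the reader's speculative load and its check.

```c
#include <assert.h>
#include <stdbool.h>

enum { OLD_B = 2, NEW_B = 4 };   /* hypothetical old and new contents of b */

struct outcome {
    bool q_points_to_b;   /* did the reader end up with q == &b? */
    int  d;               /* value the reader got for d = *q */
    bool replayed;        /* was the speculation thrown away and redone? */
};

/* Toy model: the reader guesses the result of "q = p" before loading p. */
struct outcome simulate_reader(bool guess_new_pointer)
{
    int a = 1, b = OLD_B;
    int *p = &a;
    struct outcome r;

    /* 1. Reader predicts what the load of p will return. */
    int *guess = guess_new_pointer ? &b : &a;

    /* 2. Reader issues "d = *q" early, through the guessed pointer. */
    int early_d = *guess;

    /* 3. Writer runs to completion: b = NEW_B; smp_wmb(); p = &b; */
    b = NEW_B;
    p = &b;

    /* 4. Reader really loads p and checks the guess against it. */
    int *q = p;
    r.replayed = (q != guess);          /* mismatch: trap and redo */
    r.d = r.replayed ? *q : early_d;    /* redo reads through the real q */
    r.q_points_to_b = (q == &b);

    /* The writer finished before step 4, so q is always &b here. */
    assert(r.q_points_to_b);
    return r;
}
```

In this model, guessing the old pointer fails the check and the replay
reads the new b, while guessing the new pointer passes the check and keeps
the stale early load, which is the "non-causal" q == &b, d == old b result.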

Now, nobody actually does this kind of data speculation as far as I know,
and there are perfectly valid arguments for why outside of control
speculation nobody likely will (at least partly due to the fact that it
would screw up existing expectations for memory ordering). It's also
pretty damn complicated to do. But data speculation has certainly been a
research subject, and there are papers on it.

> > Remember: the smp_wmb() only orders on the _writer_ side. Not on the
> > reader side. The writer may send out the stuff in a particular order, but
> > the reader might see them in a different order because _it_ might queue
> > the bus events internally for its caches (in particular, it could end up
> > delaying updating a particular way in the cache because it's busy).
>
> Ummm... So whilst smp_wmb() commits writes to the mercy of the cache coherency
> system in a particular order, the updates can be passed over from one cache to
> another and committed to the reader's cache in any order, and can even be
> delayed:

Right. You should _always_ have as a rule of thinking that a "smp_wmb()"
on one side absolutely _has_ to be paired with a "smp_rmb()" on the other
side. If they aren't paired, something is _wrong_.
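
As a rough userspace illustration of that pairing, here is a sketch using
C11 atomics rather than the kernel primitives. The fences are only
stand-ins: atomic_thread_fence(memory_order_release) and
atomic_thread_fence(memory_order_acquire) are assumed here as approximate
analogues of smp_wmb() and smp_rmb() (they are somewhat stronger), and the
names payload, ready, publish() and try_consume() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, written before the flag */
static atomic_int ready;   /* flag, written after the data */

/* Writer side, analogous to: payload = v; smp_wmb(); ready = 1; */
void publish(int v)
{
    /* This sketch only supports publishing once. */
    assert(!atomic_load_explicit(&ready, memory_order_relaxed));

    payload = v;
    atomic_thread_fence(memory_order_release);   /* stand-in for smp_wmb() */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side, analogous to: if (ready) { smp_rmb(); use(payload); } */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;

    /* Without this fence the writer's ordering guarantees nothing here. */
    atomic_thread_fence(memory_order_acquire);   /* stand-in for smp_rmb() */
    *out = payload;
    return true;
}
```

The point of the sketch is just the shape: the write barrier sits between
the two stores on the writer, and the matching read barrier sits between
the two loads on the reader.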

Now, the data-dependent read case is actually a very specific optimization
where we say that on certain architectures you don't need the full
smp_rmb(), so we relax the rule to be "the reader has to have a smp_rmb()
_or_ a smp_read_barrier_depends(), where the latter is only valid if the
address of the dependent read depends directly on the first one".

But the read barrier always has to be there, even though it can be of the
"weaker" type.
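
A sketch of the weaker reader, again in C11 terms rather than kernel
code: the pointer load uses memory_order_consume, which is assumed here as
the closest standard analogue of "load the pointer, then
smp_read_barrier_depends()" (compilers generally implement it as an
acquire load in practice). The names struct item, slot, head,
publish_item() and read_item() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct item { int value; };

static struct item slot;
static _Atomic(struct item *) head;   /* NULL until something is published */

/* Writer: initialise the item, then make the pointer visible. */
void publish_item(int v)
{
    slot.value = v;
    atomic_thread_fence(memory_order_release);   /* stand-in for smp_wmb() */
    atomic_store_explicit(&head, &slot, memory_order_relaxed);
}

/* Reader: the second load's address comes from the first load's value. */
int read_item(int fallback)
{
    /* Stand-in for: p = head; smp_read_barrier_depends(); */
    struct item *p = atomic_load_explicit(&head, memory_order_consume);

    if (p == NULL)
        return fallback;
    assert(p == &slot);   /* only one item exists in this sketch */
    return p->value;      /* address of this load depends on p */
}
```

The reader still has a barrier of sorts; it is just the weak kind, and it
is only good enough because p->value cannot even be addressed without the
value of p.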

And note that the address really has to have a _data_ dependency, not a
control dependency. If the second read depends on the first one, but only
through a conditional rather than through the address itself being
computed from the value that was read, then it's a control dependency, and
existing CPUs already short-circuit those through branch prediction.
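
To make the distinction concrete, here is a sketch of the two reader
shapes side by side, in C11 terms. All names (table, idx, writer_update(),
read_data_dependent(), read_control_dependent()) are hypothetical, and the
fences are assumed stand-ins for the kernel barriers, not the kernel API.

```c
#include <assert.h>
#include <stdatomic.h>

static int table[2] = { 10, 20 };
static atomic_int idx;   /* 0 until the writer switches readers to slot 1 */

/* Writer: fill slot 1, then tell readers to use it. */
void writer_update(int v)
{
    table[1] = v;
    atomic_thread_fence(memory_order_release);   /* stand-in for smp_wmb() */
    atomic_store_explicit(&idx, 1, memory_order_relaxed);
}

/* Data dependency: the loaded value is used to form the address. */
int read_data_dependent(void)
{
    /* Stand-in for: i = idx; smp_read_barrier_depends(); */
    int i = atomic_load_explicit(&idx, memory_order_consume);

    assert(i == 0 || i == 1);
    return table[i];
}

/* Control dependency: the loaded value only picks a branch. */
int read_control_dependent(void)
{
    if (atomic_load_explicit(&idx, memory_order_relaxed) == 1) {
        /*
         * &table[1] is known without the load of idx, so a CPU can
         * predict the branch and read table[1] early. A full read
         * barrier (stand-in for smp_rmb()) is needed here.
         */
        atomic_thread_fence(memory_order_acquire);
        return table[1];
    }
    return table[0];
}
```

Both functions return the same values when run in order on one thread; the
difference is only in which barrier the reader is allowed to get away
with.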

Linus