Re: [PATCH 04/18] alpha: Override READ_ONCE() with barriered implementation

From: Joel Fernandes
Date: Thu Jul 02 2020 - 11:07:37 EST


On Thu, Jul 2, 2020 at 10:55 AM Will Deacon <will@xxxxxxxxxx> wrote:
> On Thu, Jul 02, 2020 at 10:43:55AM -0400, Joel Fernandes wrote:
> > On Tue, Jun 30, 2020 at 1:38 PM Will Deacon <will@xxxxxxxxxx> wrote:
> > > diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> > > index 92ec486a4f9e..2ecd068d91d1 100644
> > > --- a/arch/alpha/include/asm/barrier.h
> > > +++ b/arch/alpha/include/asm/barrier.h
> > > - * For example, the following code would force ordering (the initial
> > > - * value of "a" is zero, "b" is one, and "p" is "&a"):
> > > - *
> > > - * <programlisting>
> > > - *   CPU 0                   CPU 1
> > > - *
> > > - *   b = 2;
> > > - *   memory_barrier();
> > > - *   p = &b;                 q = p;
> > > - *                           read_barrier_depends();
> > > - *                           d = *q;
> > > - * </programlisting>
> > > - *
> > > - * because the read of "*q" depends on the read of "p" and these
> > > - * two reads are separated by a read_barrier_depends(). However,
> > > - * the following code, with the same initial values for "a" and "b":
> > > - *
> >
> > Would it be Ok to keep this example in the kernel sources? I think it
> > serves as good documentation and highlights the issue in the Alpha
> > architecture well.
>
> I'd _really_ like to remove it, as I think it only serves to confuse people
> on a topic that is confusing enough already. Paul's perfbook [1] already has
> plenty of information about this, so I don't think we need to repeat that
> here. I could add a citation, perhaps?

True, and I also found that the LKMM docs and memory-barriers.txt talk
about it, so removing it here sounds good to me. Maybe a reference
here to either document would be OK.
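
For readers without the removed comment at hand, the pattern it described can be sketched in userspace C11. This is only an illustrative analogue, not kernel code: the function names (writer, reader, run_message_passing) are made up for the example, and memory_order_release / memory_order_consume stand in for the kernel's memory_barrier() and the dependency ordering that READ_ONCE() is meant to provide after this series. Compilers typically treat consume as acquire.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Same initial values as the removed comment: a == 0, b == 1, p == &a. */
static int a;
static int b;
static _Atomic(int *) p;

/* "CPU 0": update b, then publish its address. */
static void *writer(void *arg)
{
    (void)arg;
    b = 2;
    /* Release store: orders the write to b before the write to p. */
    atomic_store_explicit(&p, &b, memory_order_release);
    return NULL;
}

/* "CPU 1": load the pointer, then dereference it. */
static void *reader(void *arg)
{
    int *d = arg;
    /* The dereference below carries an address dependency on this load. */
    int *q = atomic_load_explicit(&p, memory_order_consume);
    *d = *q;
    return NULL;
}

/* Hypothetical helper: runs both threads once and returns d. */
int run_message_passing(void)
{
    pthread_t w, r;
    int d = -1;
    int rc;

    a = 0;
    b = 1;
    atomic_store(&p, &a);

    rc = pthread_create(&r, NULL, reader, &d);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return d;
}
```

With the ordering in place, d ends up as 0 (the reader saw p == &a) or 2 (it saw p == &b), and never as the stale value 1. The stale case is the one Alpha could expose without a barrier between the two loads.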

> > BTW, do you know any architecture where speculative execution of
> > address-dependent loads can cause similar misorderings? That would be
> > pretty insane though. In Alpha's case it is not speculation but rather
> > the split local cache design as the docs mention. The reason I ask
> > is it is pretty amusing that control-dependent loads do have such
> > misordering issues due to speculative branch execution and I wondered
> > what other games the CPUs are playing. FWIW I ran into [1] which talks
> > about analogy between memory dependence and control dependence.
>
> I think you're asking about value prediction, and the implications it would
> have on address-dependent loads where the address can itself be predicted.

Yes.
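
As a contrast with the address-dependent case, here is a minimal userspace C11 sketch of the control-dependent load mentioned above. It is an assumption-laden illustration rather than kernel code: the names (producer, consumer, run_control_dependency_demo) are hypothetical, and the acquire fence plays the role a read barrier would play in the kernel. A branch on the flag alone does not order the later load of data, because the CPU may speculate past the branch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int data;
static atomic_int flag;

/* Write the payload, then set the flag with release ordering. */
static void *producer(void *arg)
{
    (void)arg;
    data = 42;
    atomic_store_explicit(&flag, 1, memory_order_release);
    return NULL;
}

/* Read the payload only if the flag was observed. */
static void *consumer(void *arg)
{
    int *out = arg;

    *out = -1; /* sentinel: flag not seen */
    if (atomic_load_explicit(&flag, memory_order_relaxed)) {
        /*
         * The branch is only a control dependency, which does not
         * order load-to-load. The fence supplies that ordering.
         */
        atomic_thread_fence(memory_order_acquire);
        *out = data;
    }
    return NULL;
}

/* Hypothetical helper: runs both threads once and returns the result. */
int run_control_dependency_demo(void)
{
    pthread_t prod, cons;
    int out = -1;
    int rc;

    data = 0;
    atomic_store(&flag, 0);

    rc = pthread_create(&cons, NULL, consumer, &out);
    assert(rc == 0);
    rc = pthread_create(&prod, NULL, producer, NULL);
    assert(rc == 0);

    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return out;
}
```

The result is either -1 (flag not yet set) or 42. Dropping the fence would make the read of data a data race in C11 terms, and on weakly ordered hardware the consumer could see the flag set yet still read the old value of data.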

> I'm not aware of any CPUs where that is observable architecturally.

I see.

> arm64 has some load instructions that do not honour address dependencies,
> but I believe that's mainly to enable alternative cache designs for things
> like non-temporal and large vector loads.

Good to know this, thanks.

- Joel