Re: [PATCH 00/14] alpha: cleanups for 6.10

From: Paul E. McKenney
Date: Fri May 31 2024 - 15:34:01 EST


On Fri, May 31, 2024 at 04:56:28AM +0100, Maciej W. Rozycki wrote:
> On Wed, 29 May 2024, Paul E. McKenney wrote:
>
> > > Mind that the read-modify-write sequence that software does for sub-word
> > > write accesses with original Alpha hardware is precisely what hardware
> > > would have to do anyway and support for that was deliberately omitted by
> > > the architecture designers from the ISA to give it performance advantages
> > > quoted in the architecture manual. The only difference here is that with
> > > hardware read-modify-write operations atomicity for sub-word accesses is
> > > guaranteed by the ISA, however for software read-modify-write it has to be
> > > explicitly coded using the usual load-locked/store-conditional sequence in
> > > a loop. I don't think it's a big deal really, it should be trivial to do
> > > in the relevant accessors, along with the memory barriers that are needed
> > > anyway for EV56+ and possibly other ports such as the MIPS one.
> >
> > There shouldn't be any memory barriers required, and don't EV56+ have
> > single-byte loads and stores?
>
> I should have commented on this in my original reply.
>
> You're the RCU expert so you know the answer. I don't. If it's OK for
> successive writes to get reordered, or readers to see a stale value, then
> you don't need memory barriers. Otherwise you do. Whether byte accesses
> are available or not does not matter, the CPU *will* do reordering if it's
> allowed to (or more specifically, it won't do anything to prevent it from
> happening, especially in SMP configurations; I can't remember offhand if
> there are cases with UP). Also adjacent byte writes may be merged, but I
> suppose it does not matter, or does it?

RCU uses whichever wrapper is required. For example, if ordering is
required, it might use smp_store_release() and smp_load_acquire().
If ordering does not matter, it might use WRITE_ONCE() and READ_ONCE().
If tearing/fusing/merging does not matter, as in there are no concurrent
accesses, it uses plain C-language loads and stores.
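
To make the three cases concrete, here is a rough userspace sketch using
C11 atomics as stand-ins.  These are not the kernel's definitions: the
names flag, payload, hint, publish(), consume(), set_hint() and get_hint()
are made up for illustration, and the mapping from C11 memory orders to
smp_store_release()/smp_load_acquire() and WRITE_ONCE()/READ_ONCE() is
only approximate.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-ins, NOT the kernel's implementation. */
static _Atomic int flag;	/* accessed concurrently, ordering required */
static int payload;		/* plain data, published via flag */
static _Atomic int hint;	/* accessed concurrently, ordering not required */

static void publish(int v)
{
	/* Plain store: nobody reads payload until flag is set. */
	payload = v;
	/* Roughly analogous to smp_store_release(&flag, 1). */
	atomic_store_explicit(&flag, 1, memory_order_release);
}

static int consume(int *out)
{
	/* Roughly analogous to smp_load_acquire(&flag). */
	if (!atomic_load_explicit(&flag, memory_order_acquire))
		return 0;
	/* Plain load: ordered after the acquire load above. */
	*out = payload;
	return 1;
}

static void set_hint(int v)
{
	/* Roughly analogous to WRITE_ONCE(hint, v): no tearing, no ordering. */
	atomic_store_explicit(&hint, v, memory_order_relaxed);
}

static int get_hint(void)
{
	/* Roughly analogous to READ_ONCE(hint). */
	return atomic_load_explicit(&hint, memory_order_relaxed);
}
```

A reader that sees consume() return 1 is guaranteed to see the payload
stored by the matching publish(), while get_hint() promises only an
untorn value with no ordering against other accesses.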

> NB MIPS has similar architectural arrangements (and a bunch of barriers
> defined in the ISA), it's just most implementations are actually strongly
> ordered, so most people can't see the effects of this. With MIPS I know
> for sure there are cases of UP reordering, but they only really matter for
> MMIO use cases and not regular memory.

Any given architecture is required to provide architecture-specific
implementations of the various functions that meet the requirements of
the Linux-kernel memory model. See tools/memory-model for more information.
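
Purely as an illustration of the sort of software read-modify-write
accessor discussed earlier in this thread for hardware lacking byte
stores, here is a userspace sketch that updates one byte of a 32-bit
word with a compare-and-swap retry loop.  The C11 compare-exchange
stands in for a load-locked/store-conditional sequence, the helper name
store_byte_in_word() and its "lane" parameter are hypothetical, and
relaxed ordering is assumed, so any required barriers would still have
to be supplied by the caller.  This is not actual Alpha code.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically replace one byte of *word.
 * "lane" selects the byte by shift (0 = least-significant byte),
 * so the result does not depend on host endianness.
 */
static void store_byte_in_word(_Atomic uint32_t *word, unsigned int lane,
			       uint8_t val)
{
	unsigned int shift = (lane & 3u) * 8u;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t oldval = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t newval;

	do {
		/* Keep the other three bytes, substitute the target byte. */
		newval = (oldval & ~mask) | ((uint32_t)val << shift);
		/* On failure, oldval is refreshed and the loop retries. */
	} while (!atomic_compare_exchange_weak_explicit(word, &oldval, newval,
							memory_order_relaxed,
							memory_order_relaxed));
}
```

The retry loop is what prevents a concurrent update to a neighboring
byte in the same word from being lost.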

Thanx, Paul