Re: [PATCH RFC] Avoid memory barrier in read_seqcount() through load acquire

From: Linus Torvalds
Date: Mon Aug 19 2024 - 12:26:30 EST


On Mon, 19 Aug 2024 at 01:46, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> If you cannot disclose that for some reason, just say "on my ARM64 test
> machine" or something like that, so that we're not implying that this is
> true for all ARM64 implementations.

It's the same machine I have - an Ampere Altra. It's a standard
Neoverse N1 core, afaik.

It might also be a good idea to just point to the ARM documentation,
although I don't know how stable those web addresses are:

https://developer.arm.com/documentation/102336/0100/Load-Acquire-and-Store-Release-instructions

and quoting the relevant part on that page:

"Weaker ordering requirements that are imposed by Load-Acquire and
Store-Release instructions allow for micro-architectural
optimizations, which could reduce some of the performance impacts that
are otherwise imposed by an explicit memory barrier.

If the ordering requirement is satisfied using either a Load-Acquire
or Store-Release, then it would be preferable to use these
instructions instead of a DMB"

where that last sentence is basically ARM saying that load-acquire is
better than load+DMB and should be preferred.
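
To make the comparison concrete, here is a rough userspace sketch of the
two forms using C11 atomics. This is not the kernel's seqcount code:
the struct and function names are made up for illustration, and the
instruction selection noted in the comments is what compilers typically
emit for arm64, not something guaranteed.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a sequence counter; not the kernel's seqcount_t. */
struct seq_sketch {
	atomic_uint sequence;
};

/*
 * Variant 1: relaxed load followed by a separate acquire fence.
 * On arm64 this is typically a plain load plus a DMB.
 */
static inline unsigned int read_begin_fence(struct seq_sketch *s)
{
	unsigned int seq = atomic_load_explicit(&s->sequence,
						memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);
	return seq;
}

/*
 * Variant 2: a single load-acquire.
 * On arm64 this is typically one LDAR, with no separate barrier.
 */
static inline unsigned int read_begin_acquire(struct seq_sketch *s)
{
	return atomic_load_explicit(&s->sequence, memory_order_acquire);
}
```

Both variants return the same counter value and keep later loads from
being ordered before the counter read; the difference is only in how
that ordering is expressed to the CPU.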

Linus