Re: [PATCH v3 3/3] arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y
From: Linus Torvalds
Date: Tue Feb 17 2026 - 11:24:13 EST
On Mon, 16 Feb 2026 at 09:43, David Laight <david.laight.linux@xxxxxxxxx> wrote:
>
> >
> > Try doing something as simple as a "var++" on a volatile, and cry.
>
> On x86 I just see a load, inc, store - not that surprising really.
> (clang did do 'inc memory'.)
>
> It's not as though 'inc memory' is atomic (without a lock prefix).
That's not my point. My point is that it makes for absolutely
disgusting - and pointless - code generation.
That load + inc + store is a sign of the compiler missing truly basic
optimizations because "volatile" is so badly designed.
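A minimal sketch of the comparison, for anyone who wants to look at the generated code themselves. The counter names and functions are made up for illustration and are not from the kernel tree. Compile with optimization (for example -O2) and compare the two function bodies in the assembly output; the exact instructions depend on compiler and target.

```c
/* Hypothetical counters, for illustration only. */
volatile int vcount;
int count;

/*
 * Increment of a volatile object: the compiler must perform the read
 * and the write as separate volatile accesses, and typically emits a
 * distinct load, add and store.
 */
int bump_volatile(void)
{
	vcount++;
	return vcount;
}

/*
 * Increment of a plain object: the compiler is free to combine the
 * accesses, e.g. into one read-modify-write instruction where the
 * architecture has one.
 */
int bump_plain(void)
{
	count++;
	return count;
}
```

Neither version is atomic; the only difference shown here is how much freedom the compiler has when generating code.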
The thing is, we typically even *want* a single load. We actually want
not only to have basic optimizations that don't even change the
semantics - we typically even want CSE.
So we want basic optimizations and combining loads. The main reason to
use READ_ONCE() is actually a worry about compilers doing even worse
things, namely rematerialization of memory accesses - something that
compilers don't even do, because it's a bad idea, but people still
worry because they are _allowed_ to do it and who knows when something
silly happens.
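A sketch of the pattern people worry about, under the assumption that the shared value can change concurrently. The function names and the local ONCE_SKETCH() macro are hypothetical; the macro is just the classic volatile cast, not the kernel's actual READ_ONCE() definition.

```c
/* Illustration-only stand-in: a single volatile access through a cast. */
#define ONCE_SKETCH(x) (*(const volatile int *)&(x))

/*
 * Plain load: the language allows the compiler to re-read *shared_idx
 * instead of keeping 'idx' in a register, so in principle the value
 * returned could differ from the value that was range-checked.
 */
int clamp_index_plain(const int *shared_idx, int limit)
{
	int idx = *shared_idx;

	if (idx >= limit)
		return limit - 1;
	return idx;
}

/*
 * Marked load: exactly one access is made, and 'idx' is a stable
 * local snapshot for both the check and the return.
 */
int clamp_index_once(const int *shared_idx, int limit)
{
	int idx = ONCE_SKETCH(*shared_idx);

	if (idx >= limit)
		return limit - 1;
	return idx;
}
```

In single-threaded use both functions return the same result; the difference is only in what the compiler is permitted to do with the load.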
So I want that READ_ONCE(), not "volatile" on data structures, because
*some* day we can rely on more modern things and compilers will
actually get it right if we do it as
#define READ_ONCE(ptr) __atomic_load_n(ptr, __ATOMIC_RELAXED)
or similar.
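A small sketch of what using such a definition could look like, assuming a GCC- or Clang-compatible compiler that provides the __atomic builtins. The macro is renamed READ_ONCE_SKETCH() to make clear that it is not the kernel's current READ_ONCE(), and the struct and function are hypothetical.

```c
/* Sketch only: a relaxed atomic load through a pointer. */
#define READ_ONCE_SKETCH(ptr) __atomic_load_n((ptr), __ATOMIC_RELAXED)

/* Hypothetical structure; note that the field itself is not volatile. */
struct flags_demo {
	int ready;
};

/*
 * The access is marked at the point of use rather than on the data
 * structure. 'r' is a local snapshot, so both uses below see the
 * same value.
 */
int sample_ready(struct flags_demo *f)
{
	int r = READ_ONCE_SKETCH(&f->ready);

	return r > 0 ? r : 0;
}
```

The point of this form is that the annotation lives on the access, so other code touching the same field without the macro still gets ordinary optimizations.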
But last time I looked at it - which was admittedly a few years ago -
the compilers we supported didn't actually do anything reasonable here
(ie the built-in atomics were fundamentally worse than the ones we do
by hand, and even basic things like __atomic_load_n() weren't
actually better than just using 'volatile').
Maybe that has changed. We've upgraded minimum compilers since.
Linus