Re: [PATCH v4 1/2] arm64: errata: Workaround NVIDIA Olympus device store/load ordering
From: Will Deacon
Date: Tue Jun 30 2026 - 10:24:44 EST
On Thu, Jun 25, 2026 at 01:24:24PM -0500, Shanker Donthineni wrote:
> On systems with NVIDIA Olympus cores, a Device-nGnR* load can be
> observed by a peripheral before an older, non-overlapping Device-nGnR*
> store to the same peripheral. This breaks the program-order guarantee
> that software expects for Device-nGnR* accesses and can leave a
> peripheral in an incorrect state, as a load is observed before an
> earlier store takes effect.
>
> The erratum can occur only when all of the following apply:
>
> - A PE executes a Device-nGnR* store followed by a younger
> Device-nGnR* load.
> - The store is not a store-release.
> - The accesses target the same peripheral and do not overlap in bytes.
> - There is at most one intervening Device-nGnR* store in program
> order, and there are no intervening Device-nGnR* loads.
> - There is no DSB, and no DMB that orders loads, between the store and
> the load.
Does that mean that a DMB LD between the store and the load would
solve the problem?
It would be interesting to see how your benchmarks motivating patch 2
look if you leave __raw_writeX as-is and instead add a barrier in
__raw_readX before the load instruction.
Will