Re: [PATCH v3] arm64: errata: Workaround NVIDIA Olympus device store/load ordering erratum

From: Shanker Donthineni

Date: Tue Jun 16 2026 - 09:24:15 EST


Hi Will,

On 6/12/2026 7:48 AM, Jason Gunthorpe wrote:
> On Thu, Jun 11, 2026 at 08:13:48PM -0500, Shanker Donthineni wrote:
>
>> For the scalar MMIO helpers, the workaround promotes the raw writes to
>> store-release on affected CPUs as v1/v2 shown below. For the memcpy-toIO
>> helpers, could you please clarify the specific reason for adding a dmb despite
>> the documented no-ordering contract? Is the concern that some drivers may
>> be relying on ordering across memcpy_toio_*() today even though the API
>> does not guarantee it, and that we should cover those cases defensively?
>
> I think given how arm implements them today the iocopy's are actually
> the _relaxed variations.. I wonder if this matters to any user?

Following Jason's observation that on arm64 the memcpy_toio() / __iowrite{32,64}_copy() helpers are effectively the relaxed (write-combining) variants, I'd like to settle one open point before posting v4: should the workaround also promote dgh() to a dmb on affected CPUs (currently the Olympus core), or leave dgh() as a plain hint?

If you'd still prefer the dmb defensively, to cover drivers that may rely on ordering across memcpy_toio() today despite the relaxed contract, I'm happy to fold it into v4.
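
To make the two options concrete, here is a rough userspace model of the choice. It is not the kernel patch: every name in it (cpu_model, copy_tail_op, model_iowrite64_copy) is hypothetical, the "affected" flag merely stands in for whatever erratum capability check v4 ends up using, and the enum only records which operation would follow the last store rather than issuing a real dgh or dmb.

```c
/*
 * Userspace model only. Real kernel code would use dgh()/dmb() and a
 * CPU capability check; all identifiers here are hypothetical.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Operation that would follow the last store of a relaxed I/O copy. */
enum tail_op { TAIL_DGH, TAIL_DMB };

struct cpu_model {
	bool affected;        /* stands in for the erratum capability check */
	bool promote_to_dmb;  /* true: defensive dmb; false: keep the hint */
};

/* Select the tail operation: promote only on affected CPUs, and only if asked. */
static enum tail_op copy_tail_op(const struct cpu_model *cpu)
{
	if (cpu->affected && cpu->promote_to_dmb)
		return TAIL_DMB;
	return TAIL_DGH;
}

/* Model of a relaxed 64-bit copy: plain stores, then report the tail op. */
static enum tail_op model_iowrite64_copy(const struct cpu_model *cpu,
					 volatile uint64_t *to,
					 const uint64_t *from, size_t count)
{
	for (size_t i = 0; i < count; i++)
		to[i] = from[i];
	return copy_tail_op(cpu);
}
```

In this model, unaffected CPUs always keep the hint, so the question above reduces to whether promote_to_dmb should be set for affected CPUs.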


Please let me know how you'd like me to proceed.


-Shanker