Re: [PATCH v4 0/2] arm64: errata: NVIDIA Olympus device store/load ordering

From: Will Deacon

Date: Tue Jun 30 2026 - 09:59:06 EST


On Mon, Jun 29, 2026 at 06:08:37PM -0500, Shanker Donthineni wrote:
> On 6/29/2026 5:45 AM, Vladimir Murzin wrote:
> > That's interesting. With the way the patch set is structured, it
> > now looks like:
> >
> > 1. Fix the erratum, but cause a performance regression.
> > 2. Restore the lost performance and (re)apply the erratum
> > workaround.
> >
> > Would it make sense to avoid introducing the performance
> > regression in the first place by structuring the patch set
> > slightly differently?
> >
> > 1. (Re)introduce arm64 memset_io()/memcpy_toio().
> > 2. Fix the erratum once and for all.
> >
> > What do you reckon?
>
> Yes, that ordering makes sense.
>
> I can restructure v5 so that patch 1 introduces the arm64
> memset_io()/memcpy_toio() implementations while preserving the existing
> behavior. Patch 2 will
> then add the complete erratum workaround, including the conditional
> trailing DMB for those block-write helpers. This avoids introducing
> the intermediate performance regression and keeps each commit
> independently usable.
>
> Will and Catalin, could you please share your thoughts on this approach?

tbh, I think I'm ok with the current ordering. The second patch is purely
a performance thing for affected CPUs, so doesn't strictly need to be
applied or backported for functional correctness afaict.

Will