Re: [PATCH v4 2/2] arm64: io: apply the device store-release workaround once per block write
From: Will Deacon
Date: Tue Jun 30 2026 - 10:24:12 EST
On Mon, Jun 29, 2026 at 06:09:11PM -0500, Shanker Donthineni wrote:
> On 6/29/2026 5:48 AM, Vladimir Murzin wrote:
> > > + : "memory");
> > > + src += sizeof(u64);
> > > + dst += sizeof(u64);
> > > + count -= sizeof(u64);
> > > + }
> > > + while (count) {
> > > + asm volatile("strb %w0, [%1]"
> > > + : : "rZ"(*(const u8 *)src), "r"(dst) : "memory");
> > > + src++;
> > > + dst++;
> > > + count--;
> > > + }
> > > +
> > > + iomem_block_store_barrier();
> > It is perhaps a matter of taste, but having the inline assembly
> > here (and in memset_io()) might make the code clearer. To a
> > casual reader, it would be obvious that the barrier is not
> > guaranteed and is only applicable to ARM64_WORKAROUND_DEVICE_STORE_RELEASE,
> > without having to jump back and forth through the code.
> >
> > Obliviously maintainers might have different preference ;)

Oblivious maintainer here :)

> Regarding the barrier, iomem_block_store_barrier() is declared
> static __always_inline, so it does not add a function call. The nop/dmb
> osh alternative is emitted directly in each caller. I used the helper to
> avoid duplicating the alternative sequence.
>
> I understand that placing the assembly directly in both functions could
> make its conditional nature more obvious. I do not have a strong preference
> and am happy to follow Will’s and Catalin’s preference here.

I agree with Vladimir that it would be clearer to inline the conditional
barrier.
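
A rough, untested sketch of that shape, going by the nop/dmb osh
alternative described above. The function name is made up, the byte loop
is only a stand-in for the store loops in the patch, and the stub
ALTERNATIVE() is an assumption so that the fragment builds in userspace;
in the kernel the real macro would be used instead:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stub so the sketch builds outside the kernel: it always
 * selects the default instruction. The kernel's ALTERNATIVE() instead
 * patches in the replacement at boot when the cpucap is set.
 */
#define ALTERNATIVE(oldinstr, newinstr, cpucap)	oldinstr

/* Hypothetical name; the byte loop stands in for the patch's store loops. */
static void sketch_memcpy_toio(volatile void *to, const void *from,
			       size_t count)
{
	volatile uint8_t *dst = to;
	const uint8_t *src = from;

	while (count) {
		*dst++ = *src++;
		count--;
	}

	/*
	 * Conditional barrier, visible at the point of use: a nop unless
	 * ARM64_WORKAROUND_DEVICE_STORE_RELEASE is set, and issued once
	 * after the whole block rather than after every store.
	 */
	__asm__ volatile(ALTERNATIVE("nop", "dmb osh",
				     ARM64_WORKAROUND_DEVICE_STORE_RELEASE)
			 : : : "memory");
}
```

The same sequence would then be open-coded at the end of memset_io().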

It would be even better if we could avoid having to duplicate this code
to start with, but I can't immediately think of a better alternative.

Will