Re: [PATCH] iommu/arm-smmu-v3: Allow disabling Stage 1 translation

From: Jason Gunthorpe

Date: Thu Apr 23 2026 - 18:37:27 EST


On Thu, Apr 23, 2026 at 06:07:23PM +0100, Will Deacon wrote:

> I don't think it's that odd given that the STE/CD entries are bigger
> than PTEs and the SMMU permits a lot more relaxations about how they are
> accessed and cached compared to the PTW.

Well, I'm not sure bigger really matters, but I wasn't aware there was
a spec relaxation here that would make the cacheable path viable for
the PTW but not for the STE...

> Having said that, the page-table code looks broken to me even in the
> coherent case:
>
> ptep[i] = pte | paddr_to_iopte(paddr + i * sz, data);
>
> as the compiler can theoretically make a right mess of that.

Heh, great. The iommupt stuff does better. It does a 64-bit cmpxchg
to store a table pointer and a 64-bit WRITE_ONCE to store the PTE,
then a CMO through the DMA API.
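
For illustration only, that store pattern looks roughly like the
userspace sketch below. C11 atomics stand in for the kernel's cmpxchg
and WRITE_ONCE, and sync_for_device(), install_table() and store_pte()
are made-up names for this example, not the actual iommupt code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The pattern only works if 64-bit atomics are real single instructions. */
static_assert(ATOMIC_LLONG_LOCK_FREE == 2, "need lock-free 64-bit atomics");

/*
 * Hypothetical stand-in for the cache maintenance done through the DMA API.
 * Here it only counts calls so the flow can be observed.
 */
static unsigned int nr_syncs;

static void sync_for_device(const void *addr, size_t len)
{
	(void)addr;
	(void)len;
	nr_syncs++;
}

/*
 * Install a table pointer only if the slot is still empty (0).
 * A single 64-bit compare-and-exchange means a concurrent reader sees
 * either the old value or the new one, never a mix of the two.
 */
static bool install_table(_Atomic uint64_t *slot, uint64_t table_desc)
{
	uint64_t expected = 0;

	if (!atomic_compare_exchange_strong_explicit(slot, &expected,
						     table_desc,
						     memory_order_release,
						     memory_order_relaxed))
		return false;	/* somebody else populated the slot first */

	sync_for_device(slot, sizeof(*slot));
	return true;
}

/*
 * Store a leaf entry with one untorn 64-bit store, then push it out
 * towards the device. A plain "ptep[i] = val" gives the compiler no
 * such single-store guarantee.
 */
static void store_pte(_Atomic uint64_t *ptep, uint64_t pte)
{
	atomic_store_explicit(ptep, pte, memory_order_relaxed);
	sync_for_device(ptep, sizeof(*ptep));
}
```

The point of the sketch is just that each entry is written by exactly
one 64-bit access before the maintenance step, so there is a single
well-defined place where ordering has to be provided.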

The DMA API has to guarantee the right ordering, so we only have the
question below:

> > STE/CD is pretty simple now, there is only one place to put the CMO
> > and the ordering is all handled with that shared code. We no longer
> > care about ordering beyond all the writes must be visible to HW before
> > issuing the CMDQ invalidation command - which is the same environment
> > as the pagetable.
>
> You presumably rely on 64-bit single-copy atomicity for hitless updates,
> no?

Yes, just like the page table does..

I hope that's not a problem or we have an issue with the PTW :)

> > I also don't like this "lot of systems thing". I don't want these
> > powerful capabilities locked up in some giant CSP's proprietary
> > kernel. I want all the companies in the cloud market to have access
> > to the same feature set. That's what open source is supposed to be
> > driving toward. I have several interesting use cases for this
> > functionality already.
>
> Sorry, the point here was definitely _not_ about keeping this out of
> tree, nor was I trying to say that this stuff isn't important. But the
> mobile world doesn't give a hoot about KHO and _does_ tend to care about
> the impact of CMO, so we have to find a way to balance the two worlds.

Yes, that makes sense.

My argument is that the CMO on STE/CD shouldn't bother mobile, you
could even view it as a micro-optimization because we do occasionally
read back the STE/CD fields.

But if you say the SMMU STE/CD fetch doesn't have to follow the
single-copy atomicity rules and the PTW does, then ok..

And if Samiullah can tackle dma_alloc_coherent then maybe the whole
question is moot.

Jason