Re: [PATCH v9 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()
From: Will Deacon
Date: Mon Jan 26 2026 - 08:01:31 EST
On Fri, Jan 23, 2026 at 04:03:27PM -0400, Jason Gunthorpe wrote:
> On Fri, Jan 23, 2026 at 05:10:52PM +0000, Will Deacon wrote:
> > On Fri, Jan 23, 2026 at 05:05:31PM +0000, Will Deacon wrote:
> > > On Fri, Dec 19, 2025 at 12:11:28PM -0800, Nicolin Chen wrote:
> > > > + /*
> > > > + * We are committed to updating the STE. Ensure the invalidation array
> > > > + * is visible to concurrent map/unmap threads, and acquire any racing
> > > > + * IOPTE updates.
> > > > + *
> > > > + * [CPU0] | [CPU1]
> > > > + * |
> > > > + * change IOPTEs and TLB flush: |
> > > > + * arm_smmu_domain_inv_range() { | arm_smmu_install_old_domain_invs {
> > > > + * ... | rcu_assign_pointer(new_invs);
> > > > + * smp_mb(); // ensure IOPTEs | smp_mb(); // ensure new_invs
> > > > + * ... | kfree_rcu(old_invs, rcu);
> > > > + * // load invalidation array | }
> > > > + * invs = rcu_dereference(); | arm_smmu_install_ste_for_dev {
> > > > + * | STE = TTB0 // read new IOPTEs
> > > > + */
> > > > + smp_mb();
> > >
> > > I don't think we need to duplicate this comment three times, you can just
> > > refer to the first function (e.g. "See ordering comment in
> > > arm_smmu_domain_inv_range()").
> > >
> > > However, isn't the comment above misleading for this case?
> > > arm_smmu_install_old_domain_invs() has the sequencing the other way
> > > around on CPU 1: we should update the STE first.
> >
> > I also think we probably want a dma_mb() instead of an smp_mb() for all
> > of these examples? It won't make any practical difference but I think it
> > helps readability given that one of the readers is the PTW.
>
> The only actual dma_wmb() is inside arm_smmu_install_ste_for_dev()
> after updating the STE. Adding that line explicitly would help as that
> is the only point where we must have the writes actually visible to
> the DMA HW.
>
> The ones written here as smp_mb() are not required to be DMA ones and
> could all be NOP's on UP..

Hmm, I'm not sure about that.

If we've written a new (i.e. previously invalid) valid PTE to a
page-table and then we install that page-table into an STE hitlessly
(let's say we write the S2TTB field) then isn't there a window before we
do the STE invalidation where the page-table might be accessible to the
SMMU but the new PTE is still sitting in the CPU?

i.e. we can't rely on the command insertion barrier for that.

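To make the ordering I'm worried about concrete, here is a rough userspace
model. It is only a sketch and not driver code: every model_* name is made
up for illustration, and the C11 release fence merely stands in for the
place where I'd expect a dma_wmb() (or stronger) between the PTE write and
the STE update. C11 fences only order CPU observers, so this shows where the
barrier sits rather than proving anything about the PTW.

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_PTES	4
#define MODEL_PTE_VALID	1ULL

struct model_pgtable {
	_Atomic uint64_t pte[MODEL_NR_PTES];
};

struct model_ste {
	/* Stands in for the S2TTB field of an STE. */
	_Atomic(struct model_pgtable *) s2ttb;
};

/* CPU side: write a new valid PTE, then publish the table via the "STE". */
static void model_map_and_install(struct model_ste *ste,
				  struct model_pgtable *tbl,
				  size_t idx, uint64_t pa)
{
	assert(idx < MODEL_NR_PTES);

	atomic_store_explicit(&tbl->pte[idx], pa | MODEL_PTE_VALID,
			      memory_order_relaxed);

	/*
	 * Assumed stand-in for dma_wmb(): the PTE must be observable
	 * before the table becomes reachable through the STE.
	 */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&ste->s2ttb, tbl, memory_order_relaxed);
}

/* Walker side: follow the "STE" to the table and read one PTE. */
static uint64_t model_walk(struct model_ste *ste, size_t idx)
{
	struct model_pgtable *tbl;

	assert(idx < MODEL_NR_PTES);

	tbl = atomic_load_explicit(&ste->s2ttb, memory_order_relaxed);
	if (!tbl)
		return 0;	/* no table installed yet */

	/* Pairs with the release fence in model_map_and_install(). */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&tbl->pte[idx], memory_order_relaxed);
}
```

With !CONFIG_SMP, smp_mb() is just a compiler barrier, so if the fence above
were an smp_mb() in the real code there would be nothing ordering the PTE
write against the SMMU observing the new S2TTB.
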
Will