Re: [PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface
From: Will Deacon
Date: Tue Aug 03 2021 - 06:36:09 EST
On Mon, Aug 02, 2021 at 03:15:50PM +0100, Robin Murphy wrote:
> On 2021-08-02 14:04, Will Deacon wrote:
> > On Wed, Jul 28, 2021 at 04:58:44PM +0100, Robin Murphy wrote:
> > > To make io-pgtable aware of a flush queue being dynamically enabled,
> > > allow IO_PGTABLE_QUIRK_NON_STRICT to be set even after a domain has been
> > > attached to, and hook up the final piece of the puzzle in iommu-dma.
> > >
> > > Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> > > ---
> > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++++++++
> > > drivers/iommu/arm/arm-smmu/arm-smmu.c | 11 +++++++++++
> > > drivers/iommu/dma-iommu.c | 3 +++
> > > 3 files changed, 29 insertions(+)
> > >
> > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > index 19400826eba7..40fa9cb382c3 100644
> > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > @@ -2711,6 +2711,20 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
> > > return ret;
> > > }
> > > +static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
> > > + unsigned long quirks)
> > > +{
> > > + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > +
> > > + if (quirks == IO_PGTABLE_QUIRK_NON_STRICT && smmu_domain->pgtbl_ops) {
> > > + struct io_pgtable *iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> > > +
> > > + iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > > + return 0;
> > > + }
> > > + return -EINVAL;
> > > +}
> >
> > I don't see anything serialising this against a concurrent iommu_unmap(), so
> > the ordering and atomicity looks quite suspicious to me here. I don't think
> > it's just the page-table quirks either, but also setting cookie->fq_domain.
>
> Heh, I confess to very much taking the cheeky "let's say nothing and see
> what Will thinks about concurrency" approach here :)
Damnit, I fell for that didn't I?
Overall, I'm really nervous about the concurrency here and think we'd be
better off requiring the unbind as we have for the other domain changes.
> The beauty of only allowing relaxation in the strict->non-strict direction
> is that it shouldn't need serialising as such - it doesn't matter if the
> update to cookie->fq_domain is observed between iommu_unmap() and
> iommu_dma_free_iova(), since there's no correctness impact to queueing IOVAs
> which may already have been invalidated and may or may not have been synced.
> AFAICS the only condition which matters is that the setting of the
> io-pgtable quirk must observe fq_domain being set. It feels like there must
> be enough dependencies on the read side, but we might need an smp_wmb()
> between the two in iommu_dma_init_fq()?
>
> I've also flip-flopped a bit on whether fq_domain needs to be accessed with
> READ_ONCE/WRITE_ONCE - by the time of posting I'd convinced myself that it
> was probably OK, but looking again now I suppose this wacky reordering is
> theoretically possible:
>
>
> iommu_dma_unmap() {
> bool free_fq = cookie->fq_domain; // == false
>
> iommu_unmap();
>
> if (!cookie->fq_domain) // observes new non-NULL value
> iommu_tlb_sync(); // skipped
>
> iommu_dma_free_iova { // inlined
> if (free_fq) // false
> queue_iova();
> else
> free_iova_fast(); // Uh-oh!
> }
> }
>
> so although I still can't see atomicity being a problem I guess we do need
> it for the sake of reordering at least.
With your changes, I think quite a few things can go wrong.
* cookie->fq_domain may be observed but iovad->fq could be NULL
* We can miss the smp_wmb() in the pgtable code but end up deferring the
IOVA reclaim
* iommu_change_dev_def_domain() only holds the group mutex afaict, so can
possibly run concurrently with itself on the same domain?
* iommu_dma_init_fq() can flip the domain type back from
IOMMU_DOMAIN_DMA_FQ to IOMMU_DOMAIN_DMA on the error path
* set_pgtable_quirks() isn't atomic, which probably is ok for now, but the
moment we use it anywhere else it's dangerous
and I suspect there are more :/
Will