Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs

From: Jason Gunthorpe
Date: Wed Jun 01 2022 - 08:33:32 EST


On Wed, Jun 01, 2022 at 01:18:52PM +0100, Joao Martins wrote:

> > So having safe racy reading in the kernel is probably best, and so RCU
> > would be good here too.
>
> Reading dirties ought to be similar to map/unmap but slightly simpler as
> I supposedly don't need to care about the pte changing under the hood (or
> so I initially thought). I was wrestling at some point with whether
> test-and-clear was enough or whether I should switch back to cmpxchg to
> detect that the pte has changed and only mark dirty based on the old
> value[*]. The latter would align with
> how map/unmap performs the pte updates.

test-and-clear should be fine, but this all needs to be done under an
RCU context, with the page tables themselves freed by RCU. Then
you can safely chase the page table pointers down to each level
without fear of UAF.
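
As a rough userspace sketch of the test-and-clear step only: the bit
position and the helper name below are made up for illustration and are
not the VT-d PTE format. In the kernel, the walk that locates the pte
would sit between rcu_read_lock() and rcu_read_unlock(), with the table
pages freed through RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty bit position; real IOMMU PTE formats differ. */
#define PTE_DIRTY (UINT64_C(1) << 9)

/*
 * Atomically clear the dirty bit and report whether it was set.
 * A single atomic AND is enough: no other bit of the entry is
 * modified, so a concurrent hardware or software update to the
 * rest of the PTE is not overwritten.
 */
static bool pte_test_and_clear_dirty(_Atomic uint64_t *pte)
{
	assert(pte != NULL);

	uint64_t old = atomic_fetch_and(pte, ~PTE_DIRTY);

	return (old & PTE_DIRTY) != 0;
}
```

The point of using a fetch-and rather than a load followed by a store is
that the reader never writes back a stale copy of the entry.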

> I am not sure yet on dynamic demote/promote of page sizes if it changes this.

For this kind of primitive the caller must provide the locking, just
like map/unmap.

Effectively you can consider the iommu_domain as having externally
provided range-locks over the IOVA space. map/unmap/demote/promote
must run serially over intersecting IOVA ranges.
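
To make "intersecting" concrete, here is a minimal sketch of the check a
caller-side range lock would have to answer. The struct and function
names are hypothetical, not existing iommu or iommufd API, and the
ranges use an inclusive last address so a range ending at the top of the
address space does not overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive IOVA range: [start, last]. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Two operations must be serialized by the caller when their IOVA
 * ranges share at least one address. Disjoint ranges may proceed
 * concurrently.
 */
static bool iova_ranges_intersect(const struct iova_range *a,
				  const struct iova_range *b)
{
	assert(a->start <= a->last && b->start <= b->last);

	return a->start <= b->last && b->start <= a->last;
}
```

So an unmap of [0x1000, 0x1fff] and a map of [0x2000, 0x2fff] could run
in parallel, while anything touching [0x1800, 0x27ff] would have to wait
for both.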

In terms of iommufd this means we always have to hold a lock related
to the area (which is the IOVA range) before issuing any iommu call on
the domain.

Jason