Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs
From: Joao Martins
Date: Wed Jun 01 2022 - 09:52:43 EST
On 6/1/22 13:33, Jason Gunthorpe wrote:
> On Wed, Jun 01, 2022 at 01:18:52PM +0100, Joao Martins wrote:
>
>>> So having safe racy reading in the kernel is probably best, and so RCU
>>> would be good here too.
>>
>> Reading dirties ought to be similar to map/unmap but slightly simpler as
>> I supposedly don't need to care about the pte changing under the hood (or
>> so I initially thought). I was wrestling at some point if test-and-clear
>> was enough or whether I switch back cmpxchg to detect the pte has changed
>> and only mark dirty based on the old value[*]. The latter would align with
>> how map/unmap performs the pte updates.
>
> test-and-clear should be fine, but this all needs to be done under a
> RCU context while the page tables themsevles are freed by RCU. Then
> you can safely chase the page table pointers down to each level
> without fear of UAF.
>
I was actually thinking more towards holding the same IOVA range lock to
align with the rest of map/unmap/demote/etc? All these IO page table
manip have all have the same performance requirements.
>> I am not sure yet on dynamic demote/promote of page sizes if it changes this.
>
> For this kind of primitive the caller must provide the locking, just
> like map/unmap.
>
Ah OK.
> Effectively you can consider the iommu_domain has having externally
> provided range-locks over the IOVA space. map/unmap/demote/promote
> must run serially over intersecting IOVA ranges.
>
> In terms of iommufd this means we always have to hold a lock related
> to the area (which is the IOVA range) before issuing any iommu call on
> the domain.
/me nods