Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

From: Lu Baolu
Date: Wed Apr 13 2022 - 07:14:21 EST


Hi Dave,

On 2022/4/12 23:35, Dave Hansen wrote:
On 4/12/22 08:10, Jean-Philippe Brucker wrote:
I wonder if the Intel and ARM IOMMU code differ in the way they keep
references to the mm, or if this affects Intel as well, but we just
haven't tested the code enough.
The Arm code was written expecting the PASID to be freed on unbind(), not
mm exit. I missed the change of behavior, sorry (I thought your plan was
to extend PASID lifetime, not shorten it?) but as is it seems very broken.
For example in the iommu_sva_unbind_device(), we have
arm_smmu_mmu_notifier_put() clearing the PASID table entry for
"mm->pasid", which is going to end badly if the PASID has been cleared or
reallocated. We can't clear the PASID entry in mm exit because at that
point the device may still be issuing DMA for that PASID and we need to
quiesce the entry rather than deactivate it.

I think we ended up flipping some of this around on the Intel side.
Instead of having to quiesce the device on mm exit, we don't let the mm
exit until the device is done.

The Intel IOMMU code doesn't quiesce the device on mm exit. It only
tears down the PASID entry so that subsequent device accesses to the mm
are dropped silently.

Just like ARM, the Intel IOMMU code also expects the PASID to be freed
and reused only after the device is done (i.e. after
iommu_sva_unbind_device()), so that the PASID can be drained in both
hardware and software before it is reused for another purpose.
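
To make the ordering concrete, here is a minimal userspace sketch of
that expectation. It is not the driver code: every demo_* name is
hypothetical, the table is a plain array, and allocation and bind are
folded into one step for brevity. It only shows that mm exit clears the
entry (so device accesses are dropped) while the PASID number itself
stays reserved until unbind.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_PASIDS 4

/* Hypothetical per-PASID state; not a real IOMMU data structure. */
struct demo_pasid_state {
	bool allocated;     /* the PASID number is still owned by a binding */
	bool entry_present; /* the PASID table entry points at live page tables */
};

static struct demo_pasid_state demo_table[DEMO_NR_PASIDS];

/* Hand out the lowest free PASID (0 is reserved in this sketch), or -1. */
static int demo_pasid_alloc(void)
{
	for (int i = 1; i < DEMO_NR_PASIDS; i++) {
		if (!demo_table[i].allocated) {
			demo_table[i].allocated = true;
			demo_table[i].entry_present = true;
			return i;
		}
	}
	return -1;
}

/* mm exit: tear down the entry only; the number is not yet reusable. */
static void demo_mm_exit(int pasid)
{
	demo_table[pasid].entry_present = false;
}

/* A device access succeeds only while the entry is present. */
static bool demo_device_access(int pasid)
{
	return demo_table[pasid].entry_present;
}

/* unbind: the device is done, so the PASID may go back to the allocator. */
static void demo_unbind(int pasid)
{
	demo_table[pasid].entry_present = false;
	demo_table[pasid].allocated = false;
}
```

In this model a second allocation made between demo_mm_exit() and
demo_unbind() cannot return the same PASID, which is the property both
drivers rely on.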


When you program the pasid into the device, it's a lot like when you
create a thread. We bump the reference count on the mm when we program
the page table pointer into a CPU. We drop the thread's reference to
the mm when the thread exits and will no longer be using the page tables.

Same thing with pasids. We bump the refcount on the mm when the pasid
is programmed into the device. Once the device is done with the mm, we
drop the mm.

Basically, instead of refcounting the pasid itself, we just refcount the mm.

The above makes sense to me. It guarantees that mm->pasid can only be
freed and reused after the device is done.
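
As a rough illustration of that guarantee, the refcounting scheme can
be modelled in plain C as below. This is only a sketch under
assumptions: struct toy_mm and all toy_* helpers are hypothetical
stand-ins, not struct mm_struct or the kernel's reference-counting API,
and which kernel counter would actually be used is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INVALID_PASID 0u

/* Hypothetical stand-in for the mm; not the kernel definition. */
struct toy_mm {
	int refs;           /* references held by threads and bound devices */
	unsigned int pasid; /* TOY_INVALID_PASID when none is assigned */
	bool pasid_freed;   /* set once the PASID went back to the allocator */
};

static void toy_mm_get(struct toy_mm *mm)
{
	mm->refs++;
}

static void toy_mm_put(struct toy_mm *mm)
{
	assert(mm->refs > 0);
	if (--mm->refs == 0) {
		/* Last user is gone: only now may the PASID be freed. */
		mm->pasid = TOY_INVALID_PASID;
		mm->pasid_freed = true;
	}
}

/* Programming the PASID into a device takes a reference on the mm. */
static void toy_bind_device(struct toy_mm *mm, unsigned int pasid)
{
	toy_mm_get(mm);
	mm->pasid = pasid;
}

/* Once the device is done with the mm, its reference is dropped. */
static void toy_unbind_device(struct toy_mm *mm)
{
	toy_mm_put(mm);
}

/* Walk through the lifetime: returns 0 if the PASID outlives process exit. */
static int toy_lifetime_demo(void)
{
	struct toy_mm mm = { .refs = 1, .pasid = TOY_INVALID_PASID };

	toy_bind_device(&mm, 42);
	toy_mm_put(&mm); /* the thread exits while the device is still bound */
	if (mm.pasid_freed || mm.pasid != 42)
		return -1;

	toy_unbind_device(&mm); /* the device is done: last reference dropped */
	return (mm.pasid_freed && mm.refs == 0) ? 0 : -1;
}
```

The point of the model is that toy_mm_put() from the exiting thread
does not free the PASID, because the bound device still holds a
reference; the free happens only in the final put from unbind.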

Best regards,
baolu