On Wed, Nov 15, 2023 at 11:02:26AM +0800, Lu Baolu wrote:
The iopf_queue_flush_dev() is called by the iommu driver before releasingThis needs more explanation, why should anyone care?
a PASID. It ensures that all pending faults for this PASID have been
handled or cancelled, and won't hit the address space that reuses this
PASID. The driver must make sure that no new fault is added to the queue.
More importantly, why is*discarding* the right thing to do?
Especially why would we discard a partial page request group?
After we change a translation we may have PRI requests in a
queue. They need to be acknowledged, not discarded. The DMA in the
device should be restarted and the device should observe the new
translation - if it is blocking then it should take a DMA error.
More broadly, we should just let things run their normal course. The
domain to deliver the fault to should be determined very early. If we
get a fault and there is no fault domain currently assigned then just
restart it.
The main reason to fence would be to allow the domain to become freed
as the faults should be holding pointers to it. But I feel there are
simpler options for that then this..
The SMMUv3 driver doesn't use it because it only implements theThis explanation doesn't make much sense, from a device driver
Arm-specific stall fault model where DMA transactions are held in the SMMU
while waiting for the OS to handle iopf's. Since a device driver must
complete all DMA transactions before detaching domain, there are no
pending iopf's with the stall model. PRI support requires adding a call to
iopf_queue_flush_dev() after flushing the hardware page fault queue.
perspective both PRI and stall cause the device to not complete DMAs.
The difference between stall and PRI is fairly small, stall causes an
internal bus to lock up while PRI does not.
-int iopf_queue_flush_dev(struct device *dev)A naked flush_workqueue like this is really suspicious, it needs a
+int iopf_queue_discard_dev_pasid(struct device *dev, ioasid_t pasid)
{
struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
+ struct iommu_page_response resp;
+ struct iopf_fault *iopf, *next;
+ int ret = 0;
if (!iopf_param)
return -ENODEV;
flush_workqueue(iopf_param->queue->wq);
+
comment explaining why the queue can't get more work queued at this
point.
I suppose the driver is expected to stop calling
iommu_report_device_fault() before calling this function, but that
doesn't seem like it is going to be possible. Drivers should be
implementing atomic replace for the PASID updates and in that case
there is no momement when it can say the HW will stop generating PRI.