On Tue, Aug 06, 2024 at 09:49:43AM -0300, Jason Gunthorpe wrote:
On Mon, Aug 05, 2024 at 03:32:50PM +0000, Pranjal Shrivastava wrote:Hmm, I agree, I don't see `iommu_report_device_fault` be called anywhere
Here's the updated diff:I'm not sure this make sense??
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a31460f9f3d4..ed2b106e02dd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1777,7 +1777,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
goto out_unlock;
}
- iommu_report_device_fault(master->dev, &fault_evt);
+ ret = iommu_report_device_fault(master->dev, &fault_evt);
out_unlock:
mutex_unlock(&smmu->streams_mutex);
return ret;
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 0e3a9b38bef2..7684e7562584 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -532,6 +532,9 @@ void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
bool last_page;
u16 sid;
+ if (!evt)
+ return;
+
The point of this path is for the driver to retire the fault with a
failure. This prevents that from happing on Intel and we are back to
loosing track of a fault.
All calls to iommu_report_device_fault() must result in
page_response() properly retiring whatever the event was.
+static void iopf_error_response(struct device *dev, struct iommu_fault *fault)The issue originates here, why is this NULL?
+{
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
+ struct iommu_page_response resp = {
+ .pasid = fault->prm.pasid,
+ .grpid = fault->prm.grpid,
+ .code = IOMMU_PAGE_RESP_INVALID
+ };
+
+ ops->page_response(dev, NULL, &resp);
+}
void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
{
The caller has an evt? I think we should pass it down.
with a NULL evt. Hence, it does make sense to pass the evt down and
ensure we don't lose track of the event.
I'm assuming that we retired the if (!evt) check from intel->page
response since we didn't have any callers of intel->page_response
with a NULL evt. (Atleast, for now, I don't see that happen).
Lu, Will -- Any additional comments/suggestions for this?