Re: [PATCH v3 4/4] iommu/vt-d: Add page response ops support

From: Lu Baolu
Date: Fri Jul 10 2020 - 04:17:36 EST


Hi Kevin,

On 2020/7/10 13:49, Tian, Kevin wrote:
From: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
Sent: Friday, July 10, 2020 1:37 PM

Hi Kevin,

On 2020/7/10 10:42, Tian, Kevin wrote:
From: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
Sent: Thursday, July 9, 2020 3:06 PM

After page requests are handled, software must respond to the device
which raised the page request with the result. This is done through
the iommu ops.page_response if the request was reported to outside of
vendor iommu driver through iommu_report_device_fault(). This adds
the
VT-d implementation of page_response ops.

Co-developed-by: Jacob Pan<jacob.jun.pan@xxxxxxxxxxxxxxx>
Signed-off-by: Jacob Pan<jacob.jun.pan@xxxxxxxxxxxxxxx>
Co-developed-by: Liu Yi L<yi.l.liu@xxxxxxxxx>
Signed-off-by: Liu Yi L<yi.l.liu@xxxxxxxxx>
Signed-off-by: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
---
drivers/iommu/intel/iommu.c | 1 +
drivers/iommu/intel/svm.c | 100
++++++++++++++++++++++++++++++++++++
include/linux/intel-iommu.h | 3 ++
3 files changed, 104 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4a6b6960fc32..98390a6d8113 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -6057,6 +6057,7 @@ const struct iommu_ops intel_iommu_ops = {
.sva_bind = intel_svm_bind,
.sva_unbind = intel_svm_unbind,
.sva_get_pasid = intel_svm_get_pasid,
+ .page_response = intel_svm_page_response,
#endif
};

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d24e71bac8db..839d2af377b6 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -1082,3 +1082,103 @@ int intel_svm_get_pasid(struct iommu_sva
*sva)
return pasid;
}
+
+int intel_svm_page_response(struct device *dev,
+ struct iommu_fault_event *evt,
+ struct iommu_page_response *msg)
+{
+ struct iommu_fault_page_request *prm;
+ struct intel_svm_dev *sdev = NULL;
+ struct intel_svm *svm = NULL;
+ struct intel_iommu *iommu;
+ bool private_present;
+ bool pasid_present;
+ bool last_page;
+ u8 bus, devfn;
+ int ret = 0;
+ u16 sid;
+
+ if (!dev || !dev_is_pci(dev))
+ return -ENODEV;
+
+ iommu = device_to_iommu(dev, &bus, &devfn);
+ if (!iommu)
+ return -ENODEV;
+
+ if (!msg || !evt)
+ return -EINVAL;
+
+ mutex_lock(&pasid_mutex);
+
+ prm = &evt->fault.prm;
+ sid = PCI_DEVID(bus, devfn);
+ pasid_present = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+ private_present = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
+ last_page = prm->flags &
IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+
+ if (pasid_present) {
+ if (prm->pasid == 0 || prm->pasid >= PASID_MAX) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ret = pasid_to_svm_sdev(dev, prm->pasid, &svm, &sdev);
+ if (ret || !sdev) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ /*
+ * For responses from userspace, need to make sure that the
+ * pasid has been bound to its mm.
+ */
+ if (svm->flags & SVM_FLAG_GUEST_MODE) {
+ struct mm_struct *mm;
+
+ mm = get_task_mm(current);
+ if (!mm) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (mm != svm->mm) {
+ ret = -ENODEV;
+ mmput(mm);
+ goto out;
+ }
+
+ mmput(mm);
+ }
+ } else {
+ pr_err_ratelimited("Invalid page response: no pasid\n");
+ ret = -EINVAL;
+ goto out;
check pasid=0 first, then no need to indent so many lines above.
Yes.

+ }
+
+ /*
+ * Per VT-d spec. v3.0 ch7.7, system software must respond
+ * with page group response if private data is present (PDP)
+ * or last page in group (LPIG) bit is set. This is an
+ * additional VT-d requirement beyond PCI ATS spec.
+ */
What is the behavior if system software doesn't follow the requirement?
en... maybe the question is really about whether the information in prm
comes from userspace or from internally-recorded info in iommu core.
The former cannot be trusted. The latter one is OK.
We require a page response when reporting such event. The upper layer
(IOMMU core or VFIO) will be implemented with a timer, if userspace
doesn't respond in time, the timer will get expired and a FAILURE
response will be sent to device.
Yes, timer helps when userspace doesn't respond. Then I'm fine with
this patch.

Reviewed-by: Kevin Tian<kevin.tian@xxxxxxxxx>

btw when you say IOMMU core or VFIO, does it mean the timer mechanism
is not implemented yet?


It's in local tree, not upstream yet.

Best regards,
baolu