Re: [PATCH v4 09/10] iommu: Make iommu_queue_iopf() more generic

From: Baolu Lu
Date: Wed Sep 13 2023 - 02:19:07 EST


On 2023/9/13 10:34, Tian, Kevin wrote:
From: Baolu Lu<baolu.lu@xxxxxxxxxxxxxxx>
Sent: Monday, September 11, 2023 8:46 PM

On 2023/9/11 14:57, Tian, Kevin wrote:
From: Baolu Lu<baolu.lu@xxxxxxxxxxxxxxx>
Sent: Tuesday, September 5, 2023 1:24 PM

Hi Kevin,

I am trying to address this issue in the patch below. Does it look sane
to you?

iommu: Consolidate per-device fault data management

The per-device fault data is a data structure that is used to store
information about faults that occur on a device. This data is allocated
when IOPF is enabled on the device and freed when IOPF is disabled. The
data is used in the paths of iopf reporting, handling, responding, and
draining.

The fault data is protected by two locks:

- dev->iommu->lock: This lock is used to protect the allocation and
freeing of the fault data.
- dev->iommu->fault_parameter->lock: This lock is used to protect the
fault data itself.

Improve the iopf code to enforce this lock mechanism and add a reference
counter in the fault data to avoid use-after-free issues.

Can you elaborate on the use-after-free issue and why a new user count
is required?
I was concerned that when iommufd uses iopf, page fault report/response
may occur simultaneously with enable/disable PRI.

Currently, this is not an issue as the enable/disable PRI is in its own
path. In the future, we may discard this interface and enable PRI when
attaching the first PRI-capable domain, and disable it when detaching
the last PRI-capable domain.
Then let's not do it now until there is a real need after you have a
thorough design for iommufd.

I revisited this part of the code and found that it's still valuable to
make the code clean and simple. The fault parameter is accessed in
various paths, such as reporting iopf, responding to iopf, draining
iopf's, adding a queue and removing a queue. In each path, we need to
repeat the locking code below:

mutex_lock(&dev->iommu->lock);
fault_param = dev->iommu->fault_param;
if (!fault_param) {
	mutex_unlock(&dev->iommu->lock);
	return -ENODEV;
}

/* use the fault parameter */
... ...

mutex_unlock(&dev->iommu->lock);

The order of the locks is also important. Otherwise, a possible deadlock
issue will be reported by lockdep.

By consolidating the above code in the iopf_get/put_dev_fault_param()
helpers, it could be simplified to:

fault_param = iopf_get_dev_fault_param(dev);
if (!fault_param)
	return -ENODEV;

/* use the fault parameter */
... ...

iopf_put_dev_fault_param(fault_param);

This removes the lock order issue and makes the code simpler and easier
to maintain.
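
To make the idea concrete, here is a rough userspace sketch of how such
get/put helpers could look. It is not the actual patch: pthread mutexes
and a C11 atomic counter stand in for the kernel primitives, the struct
layouts and the "users" field are assumptions, and demo_iopf_enable(),
demo_iopf_disable() and iopf_demo() are hypothetical names used only for
illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace stand-ins; field layout is an assumption, not the kernel's. */
struct iommu_fault_param {
	pthread_mutex_t lock;	/* protects the fault data itself */
	atomic_int users;	/* hypothetical reference count */
};

struct dev_iommu {
	pthread_mutex_t lock;	/* protects allocation and freeing */
	struct iommu_fault_param *fault_param;
};

struct device {
	struct dev_iommu *iommu;
};

/* Take a reference; returns NULL if IOPF is not enabled on the device. */
struct iommu_fault_param *iopf_get_dev_fault_param(struct device *dev)
{
	struct iommu_fault_param *fault_param;

	pthread_mutex_lock(&dev->iommu->lock);
	fault_param = dev->iommu->fault_param;
	if (fault_param)
		atomic_fetch_add(&fault_param->users, 1);
	pthread_mutex_unlock(&dev->iommu->lock);

	return fault_param;
}

/* Drop a reference; the last put frees the fault data. */
void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param)
{
	if (atomic_fetch_sub(&fault_param->users, 1) == 1) {
		pthread_mutex_destroy(&fault_param->lock);
		free(fault_param);
	}
}

/* Hypothetical enable path: allocate and publish under dev->iommu->lock. */
int demo_iopf_enable(struct device *dev)
{
	struct iommu_fault_param *fault_param = calloc(1, sizeof(*fault_param));

	if (!fault_param)
		return -ENOMEM;
	pthread_mutex_init(&fault_param->lock, NULL);
	atomic_init(&fault_param->users, 1);	/* the device's own reference */

	pthread_mutex_lock(&dev->iommu->lock);
	if (dev->iommu->fault_param) {
		pthread_mutex_unlock(&dev->iommu->lock);
		pthread_mutex_destroy(&fault_param->lock);
		free(fault_param);
		return -EBUSY;
	}
	dev->iommu->fault_param = fault_param;
	pthread_mutex_unlock(&dev->iommu->lock);
	return 0;
}

/* Hypothetical disable path: unpublish, then drop the device's reference. */
void demo_iopf_disable(struct device *dev)
{
	struct iommu_fault_param *fault_param;

	pthread_mutex_lock(&dev->iommu->lock);
	fault_param = dev->iommu->fault_param;
	dev->iommu->fault_param = NULL;
	pthread_mutex_unlock(&dev->iommu->lock);

	if (fault_param)
		iopf_put_dev_fault_param(fault_param);
}

/* Walk through enable, get, disable, put; returns 0 if all steps behave. */
int iopf_demo(void)
{
	struct dev_iommu iommu = { .fault_param = NULL };
	struct device dev = { .iommu = &iommu };
	struct iommu_fault_param *fault_param;
	int users;

	pthread_mutex_init(&iommu.lock, NULL);
	if (iopf_get_dev_fault_param(&dev))	/* not enabled yet */
		return 1;
	if (demo_iopf_enable(&dev))
		return 2;
	fault_param = iopf_get_dev_fault_param(&dev);
	if (!fault_param)
		return 3;
	demo_iopf_disable(&dev);		/* our reference keeps it alive */
	users = atomic_load(&fault_param->users);
	iopf_put_dev_fault_param(fault_param);	/* last put frees the data */
	if (iopf_get_dev_fault_param(&dev))	/* disabled again */
		return 4;
	pthread_mutex_destroy(&iommu.lock);
	return users == 1 ? 0 : 5;
}
```

In this sketch a caller never holds dev->iommu->lock while using the
fault data, so the two locks are never nested in the fault paths, and a
concurrent disable cannot free the data while a reference is held.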

Best regards,
baolu