RE: [PATCH v11 13/16] iommu: Improve iopf_queue_remove_device()

From: Tian, Kevin
Date: Mon Feb 05 2024 - 04:01:12 EST


> From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> Sent: Tuesday, January 30, 2024 4:09 PM
> *
> - * Caller makes sure that no more faults are reported for this device.
> + * Removing a device from an iopf_queue. It's recommended to follow these
> + * steps when removing a device:
> *
> - * Return: 0 on success and <0 on error.
> + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
> + * and flush any hardware page request queues. This should be done before
> + * calling into this helper.

This first step is already not followed by the intel-iommu driver. The
Page Request Enable (PRE) bit is set in the context entry when a device
is attached to the default domain and cleared only in
intel_iommu_release_device().

But iopf_queue_remove_device() is called when IOMMU_DEV_FEAT_IOPF
is disabled, e.g. when the idxd driver is unbound from the device.

So the order is already violated.
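
To make the ordering concern concrete, here is a toy model in plain C.
It is not kernel code: struct toy_dev and all toy_* functions are
hypothetical names made up for illustration, and the consequence it
models (a page request being reported after the device has left the
queue) is an assumption about what the wrong order could allow, not
something observed in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for one device; every name here is hypothetical. */
struct toy_dev {
	bool pre_enabled;   /* models the PRE bit in the context entry */
	bool in_iopf_queue; /* models membership in an iopf_queue */
	int orphan_faults;  /* requests reported with no queue to take them */
};

/* Models the hardware raising one page request. */
void toy_page_request(struct toy_dev *d)
{
	assert(d);
	if (!d->pre_enabled)
		return;             /* reception is off: nothing is reported */
	if (!d->in_iopf_queue)
		d->orphan_faults++; /* reported, but the queue is already gone */
}

/* Order recommended by the quoted comment: stop reception, then remove. */
int toy_recommended_order(void)
{
	struct toy_dev d = { .pre_enabled = true, .in_iopf_queue = true };

	d.pre_enabled = false;   /* step 1: disable new PRI reception */
	d.in_iopf_queue = false; /* then the queue removal */
	toy_page_request(&d);    /* a late request is not reported */
	return d.orphan_faults;
}

/* Order described in this reply: queue removed while PRE is still set. */
int toy_observed_order(void)
{
	struct toy_dev d = { .pre_enabled = true, .in_iopf_queue = true };

	d.in_iopf_queue = false; /* feature disabled, device leaves the queue */
	toy_page_request(&d);    /* PRE still set: the request is reported */
	d.pre_enabled = false;   /* cleared only at release time */
	return d.orphan_faults;
}
```

In this model toy_recommended_order() leaves no orphaned request, while
toy_observed_order() leaves one, which is the window the recommended
order is meant to close.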

> + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
> + * page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
> + * not retry. This helper function handles this.
> + * - Disable PRI on the device: After calling this helper, the caller could
> + * then disable PRI on the device.

intel_iommu_disable_iopf() disables the PRI capability before calling this
helper, which is the reverse of the order described here.

> + * - Tear down the iopf infrastructure: Calling iopf_queue_remove_device()
> + * essentially disassociates the device. The fault_param might still exist,
> + * but iommu_page_response() will do nothing. The device fault parameter
> + * reference count has been properly passed from iommu_report_device_fault()
> + * to the fault handling work, and will eventually be released after
> + * iommu_page_response().

It's unclear what 'tear down' means here.
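
For reference, the reference-count hand-off that the quoted text
describes could be sketched as below. Again this is a toy model and not
the kernel implementation: struct toy_fault_param and the toy_*
functions are hypothetical, and the assumption that the queue itself
holds exactly one reference is mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-device fault parameter; hypothetical. */
struct toy_fault_param {
	int refs;      /* reference count */
	bool attached; /* false once the device has left the queue */
	int responses; /* responses actually sent to the device */
};

/* Fault report: take a reference and hand it to the handling work. */
void toy_report_fault(struct toy_fault_param *p)
{
	assert(p);
	p->refs++;
}

/* Queue removal: disassociate the device, drop the queue's reference. */
void toy_queue_remove(struct toy_fault_param *p)
{
	assert(p);
	p->attached = false;
	p->refs--;
}

/* Page response: does nothing once the device is disassociated. */
void toy_page_response(struct toy_fault_param *p)
{
	assert(p);
	if (p->attached)
		p->responses++;
}

/* End of the handling work: respond, then release the work's reference. */
void toy_work_done(struct toy_fault_param *p)
{
	toy_page_response(p);
	p->refs--;
}
```

Starting from refs == 1 and attached == true, the sequence report,
remove, work done ends with no response sent and refs == 0, so in this
model the parameter outlives the removal only until the in-flight work
finishes. If that is what the last bullet intends, it is a description
of state rather than a step for the caller.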