Re: [PATCH 2/4] iommu: Use group ownership to avoid driver attachment

From: Robin Murphy
Date: Wed Feb 15 2023 - 06:09:35 EST


On 2023-02-15 07:28, Baolu Lu wrote:
On 2023/2/15 14:56, Tian, Kevin wrote:
From: Baolu Lu<baolu.lu@xxxxxxxxxxxxxxx>
Sent: Wednesday, February 15, 2023 1:51 PM

On 2/13/23 10:19 PM, Jason Gunthorpe wrote:
On Mon, Feb 13, 2023 at 03:49:39PM +0800, Lu Baolu wrote:
@@ -2992,6 +2987,14 @@ static ssize_t iommu_group_store_type(struct
iommu_group *group,
       else
           return -EINVAL;

+    if (req_type != IOMMU_DOMAIN_DMA_FQ ||
+        group->default_domain->type != IOMMU_DOMAIN_DMA) {
+        ret = iommu_group_claim_dma_owner(group, (void *)buf);
+        if (ret)
+            return ret;
+        group_owner_claimed = true;
+    }
I don't get it, this should be done unconditionally. If we couldn't
take ownership then we simply can't progress.
The existing code allows the user to switch the default domain from
strict to lazy invalidation mode. The default domain is not changed,
hence it should be seamless and transparent to the device driver.
Is there real usage relying on this transition for a bound device?

In concept strict->lazy transition implies relaxed DMA security. It's hard
to think of a motivation of doing so while the device might be doing
in-fly DMAs.

Presumably such perf/security tradeoff should be planned way before
binding device/driver together.

btw if strict->lazy is allowed why lazy->strict is prohibited?


We all know, strict vs. lazy is a tradeoff between performance and
security.

strict -> lazy: driver works in secure mode. This transition trades off
security for better performance.

lazy->strict: The driver is already working in non-safety mode. This
transition only results in worse performance. It makes no sense. If user
want to put the driver in a secure mode, they need to unbind the driver,
reset the device and do the lazy->strict transition.

Robin might have better insights.

Yes, this was added for a definite use-case in ChromeOS, where strict->lazy needs to support being done "live" since the device in question is the storage controller for the already-mounted root filesystem. Your reasoning seems to match what I summarised in the original commit message :)

Thanks,
Robin.