Re: [PATCH] iommu: Fix def_domain_type interaction with untrusted devices

From: Jason Gunthorpe
Date: Tue Apr 16 2024 - 11:29:54 EST


On Tue, Apr 16, 2024 at 02:00:43PM +0100, Robin Murphy wrote:
> Previously, an untrusted device forcing IOMMU_DOMAIN_DMA always took
> precedence over whatever a driver's def_domain_type may have wanted to
> say. This was intentionally handled in core code since 3 years prior,
> to avoid drivers poking at the details of what is essentially a policy
> between the PCI core and the IOMMU core. Now, though, we go to the
> length of evaluating both constraints to check for any conflict, and if
> so throw our toys out of the pram and refuse to handle the device at
> all. Regardless of any intent, in practice this leaves the device, and
> potentially the rest of its group or even the whole IOMMU, in a largely
> undetermined state, which at worst may render the whole system
> unusable.

For systems supporting untrusted device security the translation must
be BLOCKED at this point.

> Unfortunately it turns out that this is a realistic situation to run
> into by connecting a PASID-capable device (e.g. a GPU) to an AMD-based
> laptop via a Thunderbolt expansion box, since the AMD IOMMU driver needs
> an identity default domain for PASIDs to be usable, and thus sets a
> def_domain_type override based on PASID capability.

The majority of places implementing def_domain_type are using it as a
statement of HW capability that should not be ignored by the core code:

- DART
* system page size is too small, can't map IOPTEs, force identity
* iommu does not support IDENTITY at all, force paging
- tegra: Device quirks mean paging and DMA API doesn't work
- amd: The driver can't support PAGING when in SNP mode
- SMMU: The driver can't support paging when in legacy binding mode and
paging domain allocation fails as well
- qcom: Looks like these devices have some iommu bypass bus in their
HW and paging doesn't work
- SMMUv3: The comment says HISI devices cannot support paging due to a HW quirk

For these force overriding the driver knowledge will either result in
domain allocate/attach failure or a broken DMA environment anyhow.

The AMD PASID thing is actually unique because the driver can still
fully support PAGING, despite it wrongly telling the core code that it
can't.

This is happening because it is all just a hack to work around the
incomplete SW implementation in the AMD driver. When the AMD driver is
completed its def_domain_type should be removed entirely.

Since actual PASID AMD attach isn't implemented yet we could just
remove that check from def_domain_type as an RC fix. Vasant can sort
it out properly by disabling pasid support on untrusted devices until
the DTE logic is fully completed.

> In general, restoring the old behaviour of forcing translation will not
> make that device's operation any more broken than leaving it potentially
> blocked or subject to the rest of a group's translations would, nor will
> it be any less safe than leaving it potentially bypassed or subject to
> the rest of a group's translations would, so do that, and let eGPUs work
> again.

Well, this is true, since we can't handle the probe error it doesn't
matter what we do.

Jason