Re: [PATCH RFC 1/1] iommu: set the default iommu-dma mode as non-strict

From: Robin Murphy
Date: Mon Mar 04 2019 - 10:52:42 EST

On 02/03/2019 06:12, Leizhen (ThunderTown) wrote:

On 2019/3/1 19:07, Jean-Philippe Brucker wrote:
Hi Leizhen,

On 01/03/2019 04:44, Leizhen (ThunderTown) wrote:

On 2019/2/26 20:36, Hanjun Guo wrote:
Hi Jean,

On 2019/1/31 22:55, Jean-Philippe Brucker wrote:

On 31/01/2019 13:52, Zhen Lei wrote:
Currently, many peripherals are faster than before. For example, the top
speed of older network cards was 10Gb/s, and now it's more than 25Gb/s. But
when IOMMU page-table mapping is enabled, it's hard to reach the top speed
in strict mode, because of the frequent map and unmap operations. In order
to keep abreast of the times, I think it's better to set non-strict as

Most users won't be aware of this relaxation and will have their system
vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
Invalidation in
Hi Jean,

In fact, we discussed the vulnerability of deferred invalidation before upstreaming
the non-strict patches. Attacks may be possible because of an untrusted device or
a mistake in the device driver, and we limited VFIO to still use strict mode.
As mentioned in the pdf, restricting memory freed under deferred invalidation so that
it can only be reused by the device mitigates the vulnerability, but that is too hard
to implement now.
A compromise may be to apply non-strict only to (1) dma_free_coherent, because the
memory is controlled by the common DMA module, so we can free the memory after
the global invalidation in the timer handler. (2) Provide new APIs alongside
iommu_unmap_page/sg that defer invalidation, and let the candidate device
drivers switch to them if they want to improve performance. (3) Make sure that only
trusted devices and trusted drivers can use (1) and (2). For example, the driver
must be built into the kernel Image.
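
To make the trade-off behind deferred invalidation concrete, here is a minimal
user-space sketch in C. It is not the iommu-dma implementation: struct
flush_queue, FQ_SIZE, fq_flush() and sketch_unmap() are all hypothetical names,
and the counters merely stand in for a real TLB invalidation and IOVA release.

```c
#include <stdbool.h>
#include <stddef.h>

#define FQ_SIZE 4 /* hypothetical batch size, chosen only for illustration */

struct flush_queue {
    unsigned long iova[FQ_SIZE]; /* unmapped, but not yet invalidated */
    size_t count;                /* entries currently queued */
    unsigned long tlb_flushes;   /* invalidations issued so far */
    unsigned long freed;         /* IOVAs released for reuse so far */
};

/* Stand-in for one global invalidation followed by releasing queued IOVAs. */
static void fq_flush(struct flush_queue *fq)
{
    if (fq->count == 0)
        return;
    fq->tlb_flushes++;
    fq->freed += fq->count;
    fq->count = 0;
}

/*
 * Strict mode: invalidate and release on every unmap.
 * Non-strict mode: queue the IOVA and invalidate once per batch, which
 * leaves a window where the device can still reach the stale mapping.
 */
static void sketch_unmap(struct flush_queue *fq, unsigned long iova, bool strict)
{
    if (strict) {
        fq->tlb_flushes++;
        fq->freed++;
        return;
    }
    if (fq->count == FQ_SIZE)
        fq_flush(fq);
    fq->iova[fq->count++] = iova;
}
```

In this sketch, a timer handler would simply call fq_flush(); queued entries are
only counted as freed after the invalidation, which is the ordering proposal (1)
relies on.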

Do we have a notion of untrusted kernel drivers? A userspace driver
It seems impossible to have such a driver. Modules loaded with insmod by root users
are the root users' own responsibility.

(VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
address space would have much easier ways to corrupt the system than to
exploit lazy mode...
Yes, so we have no need to consider untrusted drivers.

For (3), I agree that we should at least disallow lazy mode if
pci_dev->untrusted is set. At the moment it means that we require the
strictest IOMMU configuration for external-facing PCI ports, but it can
be extended to blacklist other vulnerable devices or locations.
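
A rough sketch of that policy in C, assuming only that the device exposes an
"untrusted" flag as pci_dev->untrusted does. struct sketch_pci_dev and
sketch_use_strict() are hypothetical stand-ins, not kernel code:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the single field of struct pci_dev used here. */
struct sketch_pci_dev {
    bool untrusted; /* e.g. set for devices behind external-facing ports */
};

/*
 * Returns true when strict invalidation should be used.
 * An untrusted device always gets strict mode; otherwise the
 * system-wide default applies.
 */
static bool sketch_use_strict(const struct sketch_pci_dev *pdev,
                              bool default_strict)
{
    if (pdev && pdev->untrusted)
        return true;
    return default_strict;
}
```

The check only ever tightens the policy: a trusted device follows the default,
and an untrusted one cannot be relaxed to lazy mode.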
I plan to add an attribute file for each device, especially for hotplug devices, and
let root users decide which mode should be used, strict or non-strict, because
they should know whether the hot-plug device is trusted or not.

Aside from the problem that without massive implementation changes strict/non-strict is at best a per-domain property, not a per-device one, I can't see this being particularly practical - surely the whole point of a malicious endpoint is that it's going to pretend to be some common device for which a 'trusted' kernel driver already exists? If you've chosen to trust *any* external device, I think you may as well have just set non-strict globally anyway. The effort involved in trying to implement super-fine-grained control seems hard to justify.


If you do (3) then maybe we don't need (1) and (2), which require a
tonne of work in the DMA and IOMMU layers (but would certainly be nice
to see, since it would also help handle ATS invalidation timeouts)


That way some high-end trusted devices use non-strict mode, while the others keep using
strict mode. Drivers that want to use non-strict mode should switch to the new APIs
themselves.

Why not keep the policy to secure by default, as we do for
iommu.passthrough? And maybe add something similar to
CONFIG_IOMMU_DEFAULT_PASSTHROUGH? It's easy enough for experts to pass a
command-line argument or change the default config.
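
As an illustration of "build-time default, command-line override", here is a
small user-space sketch in C. SKETCH_DEFAULT_NON_STRICT is a hypothetical
stand-in for such a Kconfig option, and the parsing is deliberately naive (a
plain substring search for "iommu.strict="); it is not how the kernel parses
its parameters.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical build-time default; 0 keeps the secure (strict) default. */
#ifndef SKETCH_DEFAULT_NON_STRICT
#define SKETCH_DEFAULT_NON_STRICT 0
#endif

/*
 * Start from the build-time default and let an explicit
 * "iommu.strict=0" or "iommu.strict=1" on the command line override it.
 * Any other value leaves the default untouched.
 */
static bool sketch_resolve_strict(const char *cmdline)
{
    bool strict = !SKETCH_DEFAULT_NON_STRICT;
    const char *key = "iommu.strict=";
    const char *p = cmdline ? strstr(cmdline, key) : NULL;

    if (p) {
        char v = p[strlen(key)];

        if (v == '0')
            strict = false;
        else if (v == '1')
            strict = true;
    }
    return strict;
}
```

With the default left at 0, only a system that explicitly passes
"iommu.strict=0" ends up in lazy mode.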

Sorry for the late reply; it was Chinese New Year, and we had a long discussion
internally. We are fine with adding a Kconfig option, but are not sure OS vendors
will set it to default y.

OS vendors do not seem happy to pass a command-line argument; to be honest,
this is our motivation for enabling non-strict by default. We hope OS vendors
can see this email thread and give some input here.