Re: [PATCH] thunderbolt: Make iommu_dma_protection more accurate

From: Robin Murphy
Date: Fri Mar 18 2022 - 11:15:30 EST


On 2022-03-18 14:47, mika.westerberg@xxxxxxxxxxxxxxx wrote:
On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
On 2022-03-18 13:25, mika.westerberg@xxxxxxxxxxxxxxx wrote:
Hi Robin,

On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
This adds quite a lot code and complexity, and honestly I would like to
keep it as simple as possible (and this is not enough because we need to
make sure the DMAR bit is there so that none of the possible connected
devices were able to overwrite our memory already).

Shall we forget the standalone sibling check and just make the
pdev->untrusted check directly in tb_acpi_add_link() then?

I think we should leave tb_acpi_add_link() untouched if possible ;-)
This is because it is used to add the device links from firmware
description that we need for proper power management of the tunneled
devices. It has little to do with the identification of the external
facing DMA-capable PCIe ports.

Furthermore these links only exists in USB4 software connection manager
systems so we do not have those in the existing Thunderbolt 3/4 systems
that use firmware based connection manager (pretty much all out there).

On reflection I guess the DMAR bit makes iommu_dma_protection
functionally dependent on ACPI already, so we don't actually lose
anything (and anyone can come back and revisit firmware-agnostic
methods later if a need appears).

I agree.

OK, so do we have any realistic options for identifying the correct PCI
devices, if USB4 PCIe adapters might be anywhere relative to their
associated NHI? Short of maintaining a list of known IDs, the only thought I
have left is that if we walk the whole PCI segment looking specifically for
hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
*probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
false negatives might be tolerable, but it still feels like a bit of a
sketchy heuristic.

Indeed.

I suppose we could just look to see if any device anywhere is marked as
external-facing, and hope that if firmware's done that much then it's done
everything right. That's still at least slightly better than what we have
today, but AFAICS still carries significant risk of a false positive for an
add-in card that firmware didn't recognise.

The port in this case, that is marked as external facing, is the PCIe
root port that the add-in-card is connected to and that is known for the
firmware in advance.

I'm satisfied that we've come round to the right conclusion on the DMAR
opt-in - I'm in the middle or writing up patches for that now - but even
Microsoft's spec gives that as a separate requirement from the flagging of
external ports, with both being necessary for Kernel DMA Protection.

Is the problem that we are here trying to solve the fact that user can
disable the IOMMU protection from the command line? Or the fact that the
firmware might not declare all the ports properly so we may end up in a
situation that some of the ports do not get the full IOMMU protection.

It's about knowing whether or not firmware has declared the ports at all. If it hasn't then the system is vulnerable to *some* degree of DMA attacks regardless of anything else (the exact degree depending on kernel config and user overrides). Complete mitigation is simply too expensive to apply by default to every device the IOMMU layer is unsure about. The Thunderbolt driver cannot be confident that protection is in place unless it can somehow know that the IOMMU layer has seen that untrusted property on the relevant ports.

These are Microsoft requirements for the OEMs in order to pass their
firmware test suite so here I would not expect to have issues. Otherwise
they simply cannot ship the thing with Windows installed.

IMHO we should just trust the firmare provided information here
(otherwise we are screwed anyway as there is no way to tell if the
devices connected prior the OS can still do DMA), and use the external
facing port indicator to idenfity the ports that need DMA protection.

Indeed that's exactly what I want to do, but it begs the question of how we *find* the firmware-provided information in the first place!

I seem to have already started writing the dumb version that will walk the whole PCI segment and assume the presence of any external-facing port implies that we're good. Let me know if I should stop ;)

Cheers,
Robin.