Re: [PATCH v2] iommu: add support for drivers that manage iommu explicitly

From: Will Deacon
Date: Wed Jul 24 2019 - 06:51:27 EST


On Tue, Jul 23, 2019 at 10:40:55AM -0700, Rob Clark wrote:
> On Tue, Jul 23, 2019 at 8:38 AM Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > On Mon, Jul 22, 2019 at 09:23:48AM -0700, Rob Clark wrote:
> > > On Mon, Jul 22, 2019 at 8:48 AM Joerg Roedel <joro@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Jul 22, 2019 at 08:41:34AM -0700, Rob Clark wrote:
> > > > > It is set by the driver:
> > > > >
> > > > > https://patchwork.freedesktop.org/patch/315291/
> > > > >
> > > > > (This doesn't really belong in devicetree, since it isn't a
> > > > > description of the hardware, so the driver is really the only place to
> > > > > set this.. which is fine because it is about a detail of how the
> > > > > driver works.)
> > > >
> > > > It is more a detail about how the firmware works. IIUC the problem is
> > > > that the firmware initializes the context mappings for the GPU and the
> > > > OS doesn't know anything about that and just overwrites them, causing
> > > > the firmware GPU driver to fail badly.
> > > >
> > > > So I think it is the task of the firmware to tell the OS not to touch
> > > > the devices mappings until the OS device driver takes over. On x86 there
> > > > is something similar with the RMRR/unity-map tables from the firmware.
> > > >
> > >
> > > Bjorn had a patchset[1] to inherit the config from firmware/bootloader
> > > when arm-smmu is probed which handles that part of the problem. My
> > > patch is intended to be used on top of his patchset. This seems to me
> > > like the best solution, if we don't have control over the firmware.
> >
> > Hmm, but the feedback from Robin on the thread you cite was that this should
> > be generalised to look more like RMRR, so there seems to be a clear message
> > here.
> >
>
> Perhaps it is a lack of creativity, or lack of familiarity w/ iommu vs
> virtualization, but I'm not quite seeing how RMRR would help.. in
> particular when dealing with both DT and ACPI cases.

Well, I suppose we'd have something for DT and something for ACPI and we'd
try to make them look similar enough that we don't need lots of divergent
code in the kernel. The RMRR-like description would describe that, for a
particular device, a specific virt:phys mapping needs to exist in the
small window between initialising the SMMU and re-initialising the device
(GPU).

I would prefer this to be framebuffer-specific, since we're already in
flagrant violation of the arm64 boot requirements wrt ongoing DMA and making
this too general could lead to all sorts of brain damage. That would
probably also allow us to limit the flexibility, by mandating things like
alignment and memory attributes.

Having said that, I just realised I'm still a bit confused about the
problem: why does the bootloader require SMMU translation for the GPU at
all? If we could limit this whole thing to be identity-mapped/bypassed,
then all we need is a per-device flag in the firmware to indicate that the
SMMU should be initialised to allow passthrough for transactions originating
from that device.

> So I kinda prefer, when possible, if arm-smmu can figure out what is going
> on by looking at the hw state at boot (since that approach would work
> equally well for DT and ACPI).

That's not going to fly.

Forcing Linux to infer the state of the system by effectively parsing the
hardware configuration directly is fragile, error-prone and may not even be
possible in the general case. Worse, if this goes wrong, the symptoms are
very likely to be undiagnosable memory corruption, which is pretty awful in
my opinion.

Not only would you need separate parsing code for every IOMMU out there,
but you'd also need to make Linux aware of device aspects that it otherwise
doesn't care about, just in case the firmware decided to use them.
Furthermore, running an older kernel on newer hardware (which may have some
extensions), could cause the parsing to silently ignore parts of the device
that indicate memory regions which are in use. On top of that, there made be
device-global state that we are unable to reconfigure and that affect
devices other than the ones in question.

So no, I'm very much against this approach and the solution absolutely needs
to come in the form of a more abstract description from firmware.

> I *think* (but need to confirm if Bjorn hasn't already) that the
> memory for the pagetables that firmware/bootloader sets up is already
> removed from the memory map efi passes to kernel, so we don't need to
> worry about kernel stomping in-use pagetables.

It's precisely this sort of fragility that makes me nervous about this whole
approach.

Will