Re: [PATCH v1 1/1] net: stmmac: dwmac-tegra: Read iommu stream id from device tree

From: Thierry Reding
Date: Fri Dec 06 2024 - 11:01:22 EST


On Fri, Dec 06, 2024 at 08:42:03AM -0500, Parker Newman wrote:
> On Wed, 4 Dec 2024 19:06:30 +0100
> Thierry Reding <thierry.reding@xxxxxxxxx> wrote:
>
> > On Wed, Dec 04, 2024 at 11:53:17AM -0500, Parker Newman wrote:
> > > On Wed, 4 Dec 2024 17:23:53 +0100
> > > Thierry Reding <thierry.reding@xxxxxxxxx> wrote:
> > >
> > > > On Tue, Nov 19, 2024 at 02:47:29PM -0500, Parker Newman wrote:
> > > > > On Tue, 19 Nov 2024 20:18:00 +0100
> > > > > Andrew Lunn <andrew@xxxxxxx> wrote:
> > > > >
> > > > > > > I think there is some confusion here. I will try to summarize:
> > > > > > > - Ihe iommu is supported by the Tegra SOC.
> > > > > > > - The way the mgbe driver is written the iommu DT property is REQUIRED.
> > > > > >
> > > > > > If it is required, please also include a patch to
> > > > > > nvidia,tegra234-mgbe.yaml and make iommus required.
> > > > > >
> > > > >
> > > > > I will add this when I submit a v2 of the patch.
> > > > >
> > > > > > > - "iommus" is a SOC DT property and is defined in tegra234.dtsi.
> > > > > > > - The mgbe device tree nodes in tegra234.dtsi DO have the iommus property.
> > > > > > > - There are no device tree changes required to to make this patch work.
> > > > > > > - This patch works fine with existing device trees.
> > > > > > >
> > > > > > > I will add the fallback however in case there is changes made to the iommu
> > > > > > > subsystem in the future.
> > > > > >
> > > > > > I would suggest you make iommus a required property and run the tests
> > > > > > over the existing .dts files.
> > > > > >
> > > > > > I looked at the history of tegra234.dtsi. The ethernet nodes were
> > > > > > added in:
> > > > > >
> > > > > > 610cdf3186bc604961bf04851e300deefd318038
> > > > > > Author: Thierry Reding <treding@xxxxxxxxxx>
> > > > > > Date: Thu Jul 7 09:48:15 2022 +0200
> > > > > >
> > > > > > arm64: tegra: Add MGBE nodes on Tegra234
> > > > > >
> > > > > > and the iommus property is present. So the requires is safe.
> > > > > >
> > > > > > Please expand the commit message. It is clear from all the questions
> > > > > > and backwards and forwards, it does not provide enough details.
> > > > > >
> > > > >
> > > > > I will add more details when I submit V2.
> > > > >
> > > > > > I just have one open issue. The code has been like this for over 2
> > > > > > years. Why has it only now started crashing?
> > > > > >
> > > > >
> > > > > It is rare for Nvidia Jetson users to use the mainline kernel. Nvidia
> > > > > provides a custom kernel package with many out of tree drivers including a
> > > > > driver for the mgbe controllers.
> > > > >
> > > > > Also, while the Orin AGX SOC (tegra234) has 4 instances of the mgbe controller,
> > > > > the Nvidia Orin AGX devkit only uses mgbe0. Connect Tech has carrier boards
> > > > > that use 2 or more of the mgbe controllers which is why we found the bug.
> > > >
> > > > Correct. Also, this was a really stupid thing that I overlooked. I don't
> > > > recall the exact circumstances, but I vaguely recall there had been
> > > > discussions about adding the tegra_dev_iommu_get_stream_id() helper
> > > > (that this patch uses) around the time that this driver was created. In
> > > > the midst of all of this I likely forgot to update the driver after the
> > > > discussions had settled.
> > > >
> > > > Anyway, I agree with the conclusion that we don't need a compatibility
> > > > fallback for this, both because it would be actively wrong to do it and
> > > > we've had the required IOMMU properties in device tree since the start,
> > > > so there can't be any regressions caused by this.
> > > >
> > > > I don't think it's necessary to make the iommus property required,
> > > > though, because there's technically no requirement for these devices to
> > > > be attached to an IOMMU. They usually are, and it's better if they are,
> > > > but they should be able to work correctly without an IOMMU.
> > > >
> > > Thanks for confirming from the Nvidia side! I wasn't sure if they would
> > > work without the iommu. That said, if you did NOT want to use the iommu
> > > and removed the iommu DT property then the probe will fail after my patch.
> > > Would we not need a guard around the writes to MGBE_WRAP_AXI_ASID0_CTRL as well?
> >
> > Well... frankly, I don't know. There's an override in the memory
> > controller which we set when a device is attached to an IOMMU. That's
> > usually how the stream ID is programmed. If we don't do that it will
> > typically default to a special passthrough stream ID (technically the
> > firmware can lock down those register and force them to a specific
> > stream ID all the time). I'm not sure what exactly the impact is of
> > these ASID registers (there's a few other instances where those are
> > needed). They are required if you want to use the IOMMU, but I don't
> > know what their meaning is if the IOMMU is not enabled.
> >
> > There's also different cases: the IOMMU can be disabled altogether, in
> > which case no page tables and such exist for any device and no
> > translation should happen whatsoever. But there's also the case where an
> > IOMMU is enabled, but certain devices shouldn't attach to them. I should
> > make it very clear that both of these are not recommended setups and I
> > don't know if they'll work. And they are mostly untested. I've been
> > meaning, but haven't found the time, to test some of these cases.
> >
>
> Yes I agree, all of those situations are very niche and not recommended for
> a Tegra platform that should always have the iommu enabled anyways. Adding an
> iommu bypass would probably be outside of the scope of this patch.
>
> I will not add a patch marking the iommu as required in the device tree
> bindings when I submit v2 unless anyone else feels different.

I was able to find a bit more information on this. Starting with
Tegra234 it's usually no longer possible to even enable bypass. It can
still be done, but it needs to be carefully coordinated between the
secure bootloader/firmware and the OS. Basically the secure firmware
can lock down the ability to bypass the IOMMU. If the firmware was
configured to allow bypass, the driver can then do so by writing the
special stream ID 0x7f into the stream ID register.

On these newer chips the memory controller override no longer has any
effect and writing the per-device stream ID registers is the only way to
attach to the IOMMU.

There's still the case where you can disable the IOMMU altogether, in
which case the IOMMU will still be bypassed, no matter what the firmware
did. My understanding is that it doesn't matter in those cases whether
we write the stream ID registers or not, they will simply get ignored.
With one exception perhaps being the bypass SID. If you write that, then
there's a protection mechanism that kicks in.

Well, after all this this still isn't entirely clear to me, but I think
what it means in a nutshell is that a) we'll want to keep the IOMMU
always on for security and because the firmware is by default configured
to not allow bypassing, b) IOMMU isn't strictly required because the
IOMMU might be completely disabled and c) we shouldn't need to
conditionalize the stream ID register writes.

That said, the tegra_dev_iommu_get_stream_id() function returns a bool
specifically to support the latter case. So the intention with the
design was that drivers would call that function and only need to write
the stream ID register if it returns true. That's not always great
because you may want (or need) to rewrite the register after suspend/
resume, in which case you probably want to resort to a cached value
rather than call that API. On the other hand the API is quite simple, so
it could serve as a cache as well.

Meh.

Thierry

Attachment: signature.asc
Description: PGP signature