Re: [PATCH v4 05/16] iommu: Move bus setup to IOMMU device registration

From: Dmitry Osipenko
Date: Thu Oct 06 2022 - 14:43:26 EST


On 10/6/22 20:12, Thierry Reding wrote:
> On Thu, Oct 06, 2022 at 04:27:39PM +0100, Robin Murphy wrote:
>> On 2022-10-06 15:01, Jon Hunter wrote:
>>> Hi Robin,
>>>
>>> On 15/08/2022 17:20, Robin Murphy wrote:
>>>> Move the bus setup to iommu_device_register(). This should allow
>>>> bus_iommu_probe() to be correctly replayed for multiple IOMMU instances,
>>>> and leaves bus_set_iommu() as a glorified no-op to be cleaned up next.
>>>>
>>>> At this point we can also handle cleanup better than just rolling back
>>>> the most-recently-touched bus upon failure - which may release devices
>>>> owned by other already-registered instances, and still leave devices on
>>>> other buses with dangling pointers to the failed instance. Now it's easy
>>>> to clean up the exact footprint of a given instance, no more, no less.
>>>
>>>
>>> Since this change, I have noticed that the DRM driver on Tegra20 is
>>> failing to probe and I am seeing ...
>>>
>>>  tegra-gr2d 54140000.gr2d: failed to attach to domain: -19
>>>  drm drm: failed to initialize 54140000.gr2d: -19

The upstream Tegra20 device-tree doesn't have an IOMMU phandle for
54140000.gr2d. In this case an IOMMU domain shouldn't be available to
the DRM driver [1]. Sounds like the IOMMU core has a bug.

[1]
https://elixir.bootlin.com/linux/latest/source/drivers/iommu/tegra-gart.c#L243

>>> Bisect points to this change and reverting it fixes it. Let me know if
>>> you have any thoughts.
>>
>> Oh, apparently what's happened is that I've inadvertently enabled the
>> tegra-gart driver, since it seems that *wasn't* calling bus_set_iommu()
>> before. Looking at the history, it appears to have been that way since
>> c7e3ca515e78 ("iommu/tegra: gart: Do not register with bus"), so essentially
>> that driver has been broken and useless for close to 8 years now :(
>>
>> Given that, I'd be inclined to "fix" it as below, or just give up and delete
>> the whole thing.
>
> I'm inclined to agree. GART is severely limited: it provides a single
> IOMMU domain with an aperture of 32 MiB. It's close to useless for
> anything we would want to do and my understanding is that people have
> been falling back to CMA for any graphics/display stuff that the GART
> would've been useful for.
>
> Given that nobody's felt the urge to fix this for the past 8 years, I
> don't think there's enough interest in this to keep it going.
>
> Dmitry, any thoughts?

This GART driver is used by a community kernel fork that has an
alternative DRM driver supporting IOMMU/GART on Tegra20. The fork is
periodically synced with the latest upstream and is used by
postmarketOS. Hence it wasn't a completely dead driver.

The 32M aperture works well for 2d/3d engines because it fits multiple
textures at once. The Tegra DRM driver needs to remap buffers
dynamically, but this is easy to implement because the DRM core has nice
helpers for that. We haven't got to the point where the upstream DRM
driver is ready to support this feature.

CMA is hard to use for anything other than display framebuffers. It's
slow, and allocations fail when the CMA area is "shared" because of
fragmentation and pinned pages. A reserved CMA area isn't an option for
the GPU because then there is no memory left for the rest of the system.

I don't see any problems with removing the GART driver. It's not going
to be used upstream anytime soon and only adds maintenance burden. We
can always re-add it in the future.

--
Best regards,
Dmitry