On Thursday, 26.08.2021 at 16:00 +0100, Robin Murphy wrote:
> On 2021-08-26 13:10, Michael Walle wrote:
> > The DMA configuration of the virtual device is inherited from the first
> > actual etnaviv device. Unfortunately, this doesn't work with an IOMMU:
> >
> > [ 5.191008] Failed to set up IOMMU for device (null); retaining platform DMA ops
> >
> > This is because there is no associated iommu_group with the device. The
> > group is set in iommu_group_add_device() which is eventually called by
> > device_add() via the platform bus:
> >   device_add()
> >     blocking_notifier_call_chain()
> >       iommu_bus_notifier()
> >         iommu_probe_device()
> >           __iommu_probe_device()
> >             iommu_group_get_for_dev()
> >               iommu_group_add_device()
> >
> > Move of_dma_configure() into the probe function, which is called after
> > device_add(). Normally, the platform code will already call it itself
> > if .of_node is set. Unfortunately, this isn't the case here.
> >
> > Also move the dma mask assignments to probe() to keep all DMA related
> > settings together.
>
> I assume the driver must already keep track of the real GPU platform
> device in order to map registers, request interrupts, etc. correctly -
> can't it also correctly use that device for DMA API calls and avoid the
> need for these shenanigans altogether?

Not without a bigger rework. There are still quite a few midlayer
issues in DRM, where dma-buf imports are dma-mapped and cached via the
virtual DRM device instead of the real GPU device. Also etnaviv is able
to coalesce multiple Vivante GPUs in a single system under one virtual
DRM device, which is used on i.MX6 where the 2D and 3D GPUs are
separate peripherals, but have the same DMA constraints.
Effectively we would need to handle N devices for the dma-mapping in a
lot of places instead of only dealing with the one virtual DRM device.
It would probably be the right thing to do anyway, but it's not something
that can be changed short-term. I'm also not yet sure about the
performance implications, as we might run into some cache maintenance
bottlenecks if we DMA-synchronize buffers to multiple real devices
instead of doing it a single time with the virtual DRM device. I know,
I know, this has a lot of assumptions baked in that could fall apart if
someone builds a SoC with multiple Vivante GPUs that have differing DMA
constraints, but up until now hardware designers have not been *that*
crazy, fortunately.
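
To make the cache maintenance concern a bit more concrete, here is a rough
userspace model of the two schemes. This is not kernel code and not how
etnaviv is implemented: struct model_dev and the model_sync_*() helpers are
made-up names, and the model assumes that every sync is one full cache
maintenance pass over the buffer, which is exactly the part I have not
measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device that buffers get DMA-mapped for. */
struct model_dev {
	const char *name;
	unsigned long sync_ops;	/* cache maintenance passes issued so far */
};

/* Hypothetical stand-in for one DMA sync of a buffer for one device. */
static void model_sync_for_device(struct model_dev *dev)
{
	dev->sync_ops++;
}

/* Current scheme: the buffer is synced once, against the virtual device. */
static unsigned long model_sync_virtual(struct model_dev *virt)
{
	assert(virt != NULL);
	model_sync_for_device(virt);
	return 1;
}

/* Per-GPU scheme: the same buffer is synced once per real GPU using it. */
static unsigned long model_sync_per_gpu(struct model_dev *gpus, size_t ngpus)
{
	size_t i;

	assert(gpus != NULL && ngpus > 0);
	for (i = 0; i < ngpus; i++)
		model_sync_for_device(&gpus[i]);
	return (unsigned long)ngpus;
}
```

With the i.MX6 case of one 2D and one 3D GPU, model_sync_per_gpu() issues
two passes for a shared buffer where model_sync_virtual() issues one.
Whether that doubling matters in practice depends on how often buffers are
actually shared between the cores, which this model says nothing about.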