Re: [PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path
From: Marek Szyprowski
Date: Mon Mar 17 2025 - 03:37:28 EST
On 13.03.2025 15:12, Robin Murphy wrote:
> On 2025-03-13 1:06 pm, Robin Murphy wrote:
>> On 2025-03-13 12:23 pm, Marek Szyprowski wrote:
>>> On 13.03.2025 12:01, Robin Murphy wrote:
>>>> On 2025-03-13 9:56 am, Marek Szyprowski wrote:
>>>> [...]
>>>>> This patch landed in yesterday's linux-next as commit bcb81ac6ae3c
>>>>> ("iommu: Get DT/ACPI parsing into the proper probe path"). In my
>>>>> tests I found it breaks booting of ARM64 RK3568-based Odroid-M1 board
>>>>> (arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts). Here is the
>>>>> relevant kernel log:
>>>>
>>>> ...and the bug-flushing-out begins!
>>>>
>>>>> Unable to handle kernel NULL pointer dereference at virtual address
>>>>> 00000000000003e8
>>>>> Mem abort info:
>>>>> ESR = 0x0000000096000004
>>>>> EC = 0x25: DABT (current EL), IL = 32 bits
>>>>> SET = 0, FnV = 0
>>>>> EA = 0, S1PTW = 0
>>>>> FSC = 0x04: level 0 translation fault
>>>>> Data abort info:
>>>>> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
>>>>> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>>>> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>>>> [00000000000003e8] user address but active_mm is swapper
>>>>> Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>>>>> Modules linked in:
>>>>> CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc3+ #15533
>>>>> Hardware name: Hardkernel ODROID-M1 (DT)
>>>>> pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> pc : devm_kmalloc+0x2c/0x114
>>>>> lr : rk_iommu_of_xlate+0x30/0x90
>>>>> ...
>>>>> Call trace:
>>>>> devm_kmalloc+0x2c/0x114 (P)
>>>>> rk_iommu_of_xlate+0x30/0x90
>>>>
>>>> Yeah, looks like this is doing something a bit questionable which can't
>>>> work properly. TBH the whole dma_dev thing could probably be cleaned up
>>>> now that we have proper instances, but for now does this work?
>>>
>>> Yes, this patch fixes the problem I've observed.
>>>
>>> Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
>>> Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
>>>
>>> BTW, this dma_dev idea has been borrowed from my exynos_iommu driver and
>>> I doubt it can be cleaned up.
>>
>> On the contrary I suspect they both can - it all dates back to when
>> we had the single global platform bus iommu_ops and the SoC drivers
>> were forced to bodge their own notion of multiple instances, but with
>> the modern core code, ops are always called via a valid IOMMU
>> instance or domain, so in principle it should always be possible to
>> get at an appropriate IOMMU device now. IIRC it was mostly about
>> allocating and DMA-mapping the pagetables in domain_alloc, where the
>> private notion of instances didn't have enough information, but
>> domain_alloc_paging solves that.
>
> Bah, in fact I think I am going to have to do that now, since although
> it doesn't crash, rk_domain_alloc_paging() will also be failing for
> the same reason. Time to find a PSU for the RK3399 board, I guess...
>
> (Or maybe just move the dma_dev assignment earlier to match Exynos?)

Well, I just found that the Exynos IOMMU is also broken on some of my test
boards. It looks like the runtime PM links are somehow not correctly
established. I will try to analyze this later in the afternoon.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland