Re: [PATCH v4 4/4] thermal: core: Add notifications call in the framework
From: Marek Szyprowski
Date: Tue Jul 07 2020 - 05:15:28 EST
Hi Daniel,
On 06.07.2020 15:46, Daniel Lezcano wrote:
> On 06/07/2020 15:17, Marek Szyprowski wrote:
>> On 06.07.2020 12:55, Daniel Lezcano wrote:
>>> The generic netlink protocol is implemented but the different
>>> notification functions are not yet connected to the core code.
>>>
>>> These changes add the notification calls in the different
>>> corresponding places.
>>>
>>> Reviewed-by: Amit Kucheria <amit.kucheria@xxxxxxxxxx>
>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
>> This patch landed in today's linux-next 20200706 as commit 5df786e46560
>> ("thermal: core: Add notifications call in the framework"). Sadly it
>> breaks booting various Samsung Exynos based boards. Here is an example
>> log from Odroid U3 board:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000010
>> pgd = (ptrval)
>> [00000010] *pgd=00000000
>> Internal error: Oops: 5 [#1] PREEMPT SMP ARM
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-00015-g5df786e46560 #1146
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> PC is at kmem_cache_alloc+0x13c/0x418
>> LR is at kmem_cache_alloc+0x48/0x418
>> pc : [<c02b5cac>]    lr : [<c02b5bb8>]    psr: 20000053
>> ...
>> Flags: nzCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none
>> Control: 10c5387d Table: 4000404a DAC: 00000051
>> Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
>> Stack: (0xee8f1cf8 to 0xee8f2000)
>> ...
>> [<c02b5cac>] (kmem_cache_alloc) from [<c08cd170>] (__alloc_skb+0x5c/0x170)
>> [<c08cd170>] (__alloc_skb) from [<c07ec19c>] (thermal_genl_send_event+0x24/0x174)
>> [<c07ec19c>] (thermal_genl_send_event) from [<c07ec648>] (thermal_notify_tz_create+0x58/0x74)
>> [<c07ec648>] (thermal_notify_tz_create) from [<c07e9058>] (thermal_zone_device_register+0x358/0x650)
>> [<c07e9058>] (thermal_zone_device_register) from [<c1028d34>] (of_parse_thermal_zones+0x304/0x7a4)
>> [<c1028d34>] (of_parse_thermal_zones) from [<c1028964>] (thermal_init+0xdc/0x154)
>> [<c1028964>] (thermal_init) from [<c0102378>] (do_one_initcall+0x8c/0x424)
>> [<c0102378>] (do_one_initcall) from [<c1001158>] (kernel_init_freeable+0x190/0x204)
>> [<c1001158>] (kernel_init_freeable) from [<c0ab85f4>] (kernel_init+0x8/0x118)
>> [<c0ab85f4>] (kernel_init) from [<c0100114>] (ret_from_fork+0x14/0x20)
>>
>> Reverting it on top of linux-next fixes the boot issue. I will
>> investigate it further soon.
> Thanks for reporting this.
>
> Can you send the addr2line result and code it points to ?
addr2line for c02b5cac (kmem_cache_alloc+0x13c/0x418) points to mm/slub.c
+2839, but I'm not sure we can trust it. IMHO it looks like memory got
trashed somewhere, but I don't have time to analyze it further right
now...
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland