Re: [PATCH 0/4] Add LVTS support for mt8192

From: Balsam CHIHI
Date: Thu Mar 09 2023 - 05:48:32 EST


Hi Chen-Yu,

On Thu, Mar 9, 2023 at 6:04 AM Chen-Yu Tsai <wenst@xxxxxxxxxxxx> wrote:
>
> On Wed, Mar 8, 2023 at 12:34 AM <bchihi@xxxxxxxxxxxx> wrote:
> >
> > From: Balsam CHIHI <bchihi@xxxxxxxxxxxx>
> >
> > Add full LVTS support (MCU thermal domain + AP thermal domain) to MediaTek MT8192 SoC.
> >
> > This series is a continuation of the previous series "Add LVTS Thermal Architecture" v14 :
> > https://patchwork.kernel.org/project/linux-pm/cover/20230209105628.50294-1-bchihi@xxxxxxxxxxxx/
> > and "Add LVTS's AP thermal domain support for mt8195" :
> > https://patchwork.kernel.org/project/linux-pm/cover/20230307154524.118541-1-bchihi@xxxxxxxxxxxx/
> >
> > Based on top of thermal/linux-next :
> > base-commit: 6828e402d06f7c574430b61c05db784cd847b19f
> >
> > Depends on these patches as they are not yet applied to the thermal/linux-next branch :
> > [1/4] dt-bindings: thermal: mediatek: Add AP domain to LVTS thermal controllers for mt8195
> > https://patchwork.kernel.org/project/linux-pm/patch/20230307154524.118541-2-bchihi@xxxxxxxxxxxx/
> > [2/4] thermal/drivers/mediatek/lvts_thermal: Add AP domain for mt8195
> > https://patchwork.kernel.org/project/linux-pm/patch/20230307154524.118541-3-bchihi@xxxxxxxxxxxx/
> >
> > Balsam CHIHI (4):
> > dt-bindings: thermal: mediatek: Add LVTS thermal controller definition
> > for mt8192
> > thermal/drivers/mediatek/lvts_thermal: Add mt8192 support
> > arm64: dts: mediatek: mt8192: Add thermal zones and thermal nodes
> > arm64: dts: mediatek: mt8192: Add temperature mitigation threshold
>
> I tried this on my Hayato. As soon as lvts_ap probes and its thermal zones
> are registered, a "critical temperature reached" warning is immediately
> triggered for all the zones and a reboot is forced. A NULL pointer dereference
> is also triggered somewhere. I filtered out all the interspersed "critical
> temperature" messages:
>

Thank you very much for testing!
It seems like interrupts on mt8192 and mt8195 do not behave the same way.
I am investigating the issues.

> [ 2.943847] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000600
> [ 2.958818] Mem abort info:
> [ 2.965996] ESR = 0x0000000096000005
> [ 2.973765] SMCCC: SOC_ID: ID = jep106:0426:8192 Revision = 0x00000000
> [ 2.975442] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 2.987305] SET = 0, FnV = 0
> [ 2.995521] EA = 0, S1PTW = 0
> [ 3.004265] FSC = 0x05: level 1 translation fault
> [ 3.014365] Data abort info:
> [ 3.017344] ISV = 0, ISS = 0x00000005
> [ 3.021279] CM = 0, WnR = 0
> [ 3.022124] GACT probability NOT on
> [ 3.024277] [0000000000000600] user address but active_mm is swapper
> [ 3.034190] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> [ 3.044738] Modules linked in:
> [ 3.044745] CPU: 0 PID: 97 Comm: irq/273-1100b00 Not tainted
> 6.3.0-rc1-next-20230308-01996-g3c0b9a61a3e5-dirty #575
> c7b94096b594a95f18217c2ad4a2bd6d2c431108
> [ 3.044751] Hardware name: Google Hayato rev1 (DT)
> [ 3.044755] pstate: 60000009 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 3.052055] pc : __mutex_lock+0x60/0x438
> [ 3.052066] lr : __mutex_lock+0x54/0x438
> [ 3.052070] sp : ffffffc008883c60
> [ 3.070822] x29: ffffffc008883c60 x28: ffffff80c281a880 x27: 000881f00009001f
> [ 3.070830] x26: 1fc0000000247c00 x25: ffffff80c281a900 x24: 0000000000000000
> [ 3.070837] x23: 0000000000000000 x22: ffffffe5ae5d45f4 x21: 0000000000000002
> [ 3.086211] x20: 0000000000000000 x19: 00000000000005a0 x18: ffffffffffffffff
> [ 3.086218] x17: 6568636165722065 x16: 727574617265706d x15: 0000000000000028
> [ 3.097773] x14: 0000000000000000 x13: 0000000000003395 x12: ffffffe5af7f6ff0
> [ 3.097780] x11: 65706d655428206e x10: 0000000000000000 x9 : ffffffe5adcf4b08
> [ 3.097787] x8 : ffffffe5afe03230 x7 : 00000000000261b0 x6 : ffffff80c2b86600
> [ 3.105609] x5 : 0000000000000000 x4 : ffffff80c2b86600 x3 : 0000000000000000
> [ 3.112565] x2 : ffffff9b505f6000 x1 : 0000000000000000 x0 : 0000000000000000
> [ 3.127593] Call trace:
> [ 3.127595] __mutex_lock+0x60/0x438
> [ 3.127600] mutex_lock_nested+0x34/0x48
> [ 3.141844] thermal_zone_device_update+0x34/0x80
> [ 3.152879] lvts_irq_handler+0xbc/0x158
> [ 3.152886] irq_thread_fn+0x34/0xb8
> [ 3.161489] irq_thread+0x19c/0x298
> [ 3.161494] kthread+0x11c/0x128
> [ 3.175152] ret_from_fork+0x10/0x20
> [ 3.175163] Code: 97ccbb7c 9000bea0 b9411400 35000080 (f9403260)
> [ 3.189402] ---[ end trace 0000000000000000 ]---
> [ 3.193417] Kernel panic - not syncing: Oops: Fatal exception
> [ 3.201255] Kernel Offset: 0x25a5c00000 from 0xffffffc008000000
> [ 3.201257] PHYS_OFFSET: 0x40000000
> [ 3.201259] CPU features: 0x600000,01700506,3200720b
> [ 3.201263] Memory Limit: none
> [ 3.376838] Rebooting in 30 seconds..
>
>
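For reference, the ESR value and the faulting instruction in the trace above can be decoded by hand. The sketch below is only an illustration and is not part of the series: the helper names are made up, and the bit layouts are assumed to follow the Armv8-A ESR_EL1 and LDR (immediate, unsigned offset) encodings. It reproduces the EC/FSC fields printed by the kernel and shows that the faulting load is "ldr x0, [x19, #0x60]", so with x19 = 0x5a0 the access lands on the reported address 0x600. One possible reading, not confirmed here, is that the mutex pointer itself was a small member offset taken from a NULL thermal zone pointer.

```python
# Illustrative decoding of values copied from the oops above.
# Helper names are hypothetical; the bit layouts are assumed to match the
# Armv8-A ESR_EL1 and LDR (immediate, unsigned offset) encodings.

def decode_esr(esr: int) -> dict:
    """Split an ESR_EL1 value into the fields the kernel prints."""
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "FSC": iss & 0x3F,             # fault status code, ISS[5:0]
        "WnR": (iss >> 6) & 0x1,       # 0 means the access was a read
    }

def decode_ldr_x_uoff(insn: int) -> dict:
    """Decode a 64-bit LDR Xt, [Xn, #imm] (unsigned offset form)."""
    if (insn >> 22) != 0x3E5:          # opcode bits [31:22] for this form
        raise ValueError("not a 64-bit LDR (unsigned offset)")
    return {
        "Rt": insn & 0x1F,                      # destination register
        "Rn": (insn >> 5) & 0x1F,               # base register
        "offset": ((insn >> 10) & 0xFFF) * 8,   # imm12 scaled by 8 bytes
    }

if __name__ == "__main__":
    esr = decode_esr(0x96000005)           # from "Internal error: Oops: ..."
    ldr = decode_ldr_x_uoff(0xF9403260)    # from the "Code:" line, in parentheses
    x19 = 0x5A0                            # from the register dump
    print({k: hex(v) for k, v in esr.items()})
    print(f"ldr x{ldr['Rt']}, [x{ldr['Rn']}, #{hex(ldr['offset'])}]")
    print("computed fault address:", hex(x19 + ldr["offset"]))
```

The computed address matches the "virtual address 0000000000000600" line, which at least confirms that the fault comes from dereferencing the pointer held in x19 and not from a corrupted stack or return address.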
[...]

Best regards,
Balsam