Hi Peter,
We have tested with the new patch and are still seeing the same issue. These dumps do not have debug traces, but from code review it seems a race still exists. Could you please check the sequence below?
Controller Thread                               CPUHP Thread
takedown_cpu
kthread_park
kthread_parkme
Set KTHREAD_SHOULD_PARK
                                                smpboot_thread_fn
                                                set Task interruptible
wake_up_process
                                                    kthread_parkme
                                                    SET TASK_PARKED
                                                    schedule
                                                  raw_spin_lock(&rq->lock)
                                                    context_switch
                                                finish_lock_switch
                                                    Case TASK_PARKED
                                                    kthread_park_complete
SET TASK_INTERRUPTIBLE
We are also seeing the same warning when the controller unparks the cpuhp thread:
	if (!wait_task_inactive(p, state)) {
		WARN_ON(1);
		return;
	}
[  325.065893] [<ffffff8920ed0200>] kthread_unpark+0x80/0xd8
[  325.065902] [<ffffff8920eab754>] bringup_cpu+0xa0/0x12c
[  325.065910] [<ffffff8920eaae90>] cpuhp_invoke_callback+0xb4/0x5c8
[  325.065917] [<ffffff8920eabd98>] cpuhp_up_callbacks+0x3c/0x154
[  325.065924] [<ffffff8920ead220>] _cpu_up+0x134/0x208
[  325.065931] [<ffffff8920ead45c>] do_cpu_up+0x168/0x1a0
[  325.065938] [<ffffff8920ead4b8>] cpu_up+0x24/0x30
[  325.065948] [<ffffff89215b1408>] cpu_subsys_online+0x20/0x2c
[  325.065956] [<ffffff89215aac64>] device_online+0x70/0xb4
[  325.065962] [<ffffff89215aad78>] online_store+0xd0/0xdc
[  325.065971] [<ffffff89215a7424>] dev_attr_store+0x40/0x54
[  325.065982] [<ffffff89210d8a98>] sysfs_kf_write+0x5c/0x74
[  325.065988] [<ffffff89210d7b9c>] kernfs_fop_write+0xcc/0x1ec
[  325.065999] [<ffffff8921049288>] vfs_write+0xb4/0x1d0
[  325.066006] [<ffffff892104a858>] SyS_write+0x60/0xc0
[  325.066014] [<ffffff8920e83770>] el0_svc_naked+0x24/0x28
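For reference, here is a minimal user-space sketch of why that check would warn if the parked state gets overwritten. It only models the state-match part of the check. The names (model_task, model_wait_task_inactive, model_unpark_warns) and the state values are illustrative assumptions, not the kernel's definitions, and it assumes the late TASK_INTERRUPTIBLE store from the sequence above is what ends up in the task state.

```c
/* Illustrative values only; the real constants live in include/linux/sched.h. */
enum model_state {
	MODEL_TASK_INTERRUPTIBLE = 0x01,
	MODEL_TASK_PARKED        = 0x40,
};

/* Hypothetical stand-in for the one task_struct field that matters here. */
struct model_task {
	int state;
};

/*
 * Simplified model of the state-match check: succeed (1) only if the
 * task is still in the state the caller expects, otherwise fail (0).
 */
static int model_wait_task_inactive(const struct model_task *p, int match_state)
{
	if (match_state && p->state != match_state)
		return 0;
	return 1;
}

/* Returns 1 if the unpark path would reach the WARN_ON(1) and bail out. */
static int model_unpark_warns(const struct model_task *p)
{
	return !model_wait_task_inactive(p, MODEL_TASK_PARKED);
}
```

With state left at MODEL_TASK_PARKED the model does not warn. Once the state is overwritten with MODEL_TASK_INTERRUPTIBLE it does, which matches the trace above.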
After this, the same crash occurred:
[  325.521307] [<ffffff8920ed4aac>] smpboot_thread_fn+0x26c/0x2c8
[  325.527295] [<ffffff8920ecfb24>] kthread+0xf4/0x108
I will add more ftrace debug points to check exactly what is going on.
Regards
Gaurav