Corrected the formatting; sorry for the spam.
Hi Peter,
We tested with the new patch and still see the same issue. These dumps do not have debug traces, but from code review it seems a race still exists. Can you please check it once:
Controller Thread                               CPUHP Thread
takedown_cpu
kthread_park
kthread_parkme
Set KTHREAD_SHOULD_PARK
                                                smpboot_thread_fn
                                                set Task interruptible
wake_up_process
                                                kthread_parkme
                                                SET TASK_PARKED
                                                schedule
                                                raw_spin_lock(&rq->lock)
                                                context_switch
                                                finish_lock_switch
                                                Case TASK_PARKED
                                                kthread_park_complete
SET TASK_INTERRUPTIBLE
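To make the ordering above easier to reason about, here is a minimal userspace sketch that replays the steps sequentially. It is not kernel code: `struct model_task`, `replay_reported_interleaving()` and `model_state_matches()` are hypothetical names used only for illustration, the enum values are arbitrary, and the effect of `wake_up_process` itself is not modelled. It only shows that, if the last write to the task state happens after `kthread_park_complete`, a later check against `TASK_PARKED` cannot succeed.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel task states; values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_PARKED };

/* Hypothetical model of the cpuhp kthread, not a kernel structure. */
struct model_task {
	enum task_state state;
	bool should_park;	/* models KTHREAD_SHOULD_PARK */
	bool park_completed;	/* models kthread_park_complete() having run */
};

/* Replays the steps in the order listed in the diagram above. */
void replay_reported_interleaving(struct model_task *t)
{
	/* Controller: kthread_park() sets KTHREAD_SHOULD_PARK. */
	t->should_park = true;

	/* CPUHP: smpboot_thread_fn() marks itself interruptible. */
	t->state = TASK_INTERRUPTIBLE;

	/* Controller: wake_up_process() is issued here (effect not modelled). */

	/* CPUHP: kthread_parkme(), schedule(), kthread_park_complete(). */
	if (t->should_park) {
		t->state = TASK_PARKED;
		t->park_completed = true;
	}

	/* Last step of the diagram: state is written back to interruptible. */
	t->state = TASK_INTERRUPTIBLE;
}

/* Models only a "does the state match" check, as an assumption. */
bool model_state_matches(const struct model_task *t, enum task_state match)
{
	return t->state == match;
}
```

Calling `replay_reported_interleaving()` on a zero-initialised task and then `model_state_matches(&t, TASK_PARKED)` returns false even though `park_completed` is set, which is the shape of the failure suspected here.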
We are also seeing the same warning when the controller unparks the cpuhp thread:
	if (!wait_task_inactive(p, state)) {
		WARN_ON(1);
		return;
	}
[  325.065893] [<ffffff8920ed0200>] kthread_unpark+0x80/0xd8
[  325.065902] [<ffffff8920eab754>] bringup_cpu+0xa0/0x12c
[  325.065910] [<ffffff8920eaae90>] cpuhp_invoke_callback+0xb4/0x5c8
[  325.065917] [<ffffff8920eabd98>] cpuhp_up_callbacks+0x3c/0x154
[  325.065924] [<ffffff8920ead220>] _cpu_up+0x134/0x208
[  325.065931] [<ffffff8920ead45c>] do_cpu_up+0x168/0x1a0
[  325.065938] [<ffffff8920ead4b8>] cpu_up+0x24/0x30
[  325.065948] [<ffffff89215b1408>] cpu_subsys_online+0x20/0x2c
[  325.065956] [<ffffff89215aac64>] device_online+0x70/0xb4
[  325.065962] [<ffffff89215aad78>] online_store+0xd0/0xdc
[  325.065971] [<ffffff89215a7424>] dev_attr_store+0x40/0x54
[  325.065982] [<ffffff89210d8a98>] sysfs_kf_write+0x5c/0x74
[  325.065988] [<ffffff89210d7b9c>] kernfs_fop_write+0xcc/0x1ec
[  325.065999] [<ffffff8921049288>] vfs_write+0xb4/0x1d0
[  325.066006] [<ffffff892104a858>] SyS_write+0x60/0xc0
[  325.066014] [<ffffff8920e83770>] el0_svc_naked+0x24/0x28
After this, the same crash occurred:
[  325.521307] [<ffffff8920ed4aac>] smpboot_thread_fn+0x26c/0x2c8
[  325.527295] [<ffffff8920ecfb24>] kthread+0xf4/0x108
I will add more ftrace debug points to check exactly what is going on.
Regards
Gaurav