profiling: Enabling kernel sleep profile lockups the system

From: Tetsuo Handa
Date: Sat Jul 27 2024 - 08:31:04 EST


The lock in get_wchan() was introduced by commit 42a20f86dc19 ("sched: Add
wrapper for get_wchan() to keep task blocked"). That is, presumably
kernel sleep profiling is no longer functional since Linux 5.16...

On 2024/07/17 19:08, Tetsuo Handa wrote:
(...snipped...)
>
> By the way, there remains a bug which this pull request does not address.
> Either booting with
>
> profile=sleep
>
> kernel command line option added or executing
>
> # echo -n sleep > /sys/kernel/profiling
>
> after boot causes the system to lock up.
>
> I've reported this problem at
> https://lkml.kernel.org/r/60011cc3-b9fc-49c8-b0c0-35aafe46f313@xxxxxxxxxxxxxxxxxxx
> but nobody was interested in. Can somebody test this problem using real hardware?
> If this problem happens with real hardware, maybe nobody is using "kernel sleep
> profiling" and we can remove "kernel sleep profiling" functionality.
>
>
>
> Booting with profile=sleep added:
> ----------------------------------------
> [ 7.254439] kernel profiling enabled schedstats, disable via kernel.sched_schedstats.
> [ 7.254514] kernel sleep profiling enabled (shift: 0)
> (...snipped...)
> [ 13.334555] smpboot: CPU0: 12th Gen Intel(R) Core(TM) i7-1255U (family: 0x6, model: 0x9a, stepping: 0x4)
> [ 13.337444]
> [ 13.337857] ============================================
> [ 13.337857] WARNING: possible recursive locking detected
> [ 13.337857] 6.10.0 #95 Not tainted
> [ 13.337857] --------------------------------------------
> [ 13.337857] swapper/0/1 is trying to acquire lock:
> [ 13.337857] ffff8a1784955d38 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x80
> [ 13.337857]
> [ 13.337857] but task is already holding lock:
> [ 13.337857] ffff8a1784955d38 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x54/0x5f0
> [ 13.337857]
> [ 13.337857] other info that might help us debug this:
> [ 13.337857] Possible unsafe locking scenario:
> [ 13.337857]
> [ 13.337857] CPU0
> [ 13.337857] ----
> [ 13.337857] lock(&p->pi_lock);
> [ 13.337857] lock(&p->pi_lock);
> [ 13.337857]
> [ 13.337857] *** DEADLOCK ***
> [ 13.337857]
> [ 13.337857] May be due to missing lock nesting notation
> [ 13.337857]
> [ 13.337857] 2 locks held by swapper/0/1:
> [ 13.337857] #0: ffff8a1784955d38 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x54/0x5f0
> [ 13.337857] #1: ffff8a1819ff7dd8 (&rq->__lock){-.-.}-{2:2}, at: try_to_wake_up+0x16c/0x5f0
> [ 13.337857]
> [ 13.337857] stack backtrace:
> [ 13.337857] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0 #95
> [ 13.337857] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> [ 13.337857] Call Trace:
> [ 13.337857] <TASK>
> [ 13.337857] dump_stack_lvl+0x79/0xb0
> [ 13.337857] validate_chain+0x300/0x4c0
> [ 13.337857] __lock_acquire+0x51f/0xb90
> [ 13.337857] lock_acquire.part.0+0x6e/0x1c0
> [ 13.337857] ? get_wchan+0x32/0x80
> [ 13.337857] ? lock_acquire+0xf4/0x130
> [ 13.337857] ? get_wchan+0x32/0x80
> [ 13.337857] __raw_spin_lock_irq+0x40/0x90
> [ 13.337857] ? get_wchan+0x32/0x80
> [ 13.337857] get_wchan+0x32/0x80
> [ 13.337857] __update_stats_enqueue_sleeper+0x153/0x380
> [ 13.337857] enqueue_entity+0x42b/0x580
> [ 13.337857] enqueue_task_fair+0x97/0x3d0
> [ 13.337857] activate_task+0x4e/0xe0
> [ 13.337857] ttwu_do_activate+0x5d/0x2a0
> [ 13.337857] try_to_wake_up+0x1cf/0x5f0
> [ 13.337857] ? __wait_for_common+0x4d/0x220
> [ 13.337857] __kthread_create_worker+0xab/0x100
> [ 13.337857] kthread_create_worker+0x61/0x80
> [ 13.337857] workqueue_init+0x1d/0x390
> [ 13.337857] kernel_init_freeable+0xc4/0x230
> [ 13.337857] ? __pfx_kernel_init+0x10/0x10
> [ 13.337857] kernel_init+0x1a/0x1d0
> [ 13.337857] ret_from_fork+0x31/0x50
> [ 13.337857] ? __pfx_kernel_init+0x10/0x10
> [ 13.337857] ret_from_fork_asm+0x1a/0x30
> [ 13.337857] </TASK>
> ----------------------------------------
>
> Writing to /sys/kernel/profiling after boot:
> ----------------------------------------
> [ 69.085956] kernel profiling enabled schedstats, disable via kernel.sched_schedstats.
> [ 69.096289] kernel sleep profiling enabled (shift: 0)
> [ 69.102141]
> [ 69.105003] ======================================================
> [ 69.111960] WARNING: possible circular locking dependency detected
> [ 69.118765] 6.10.0 #95 Not tainted
> [ 69.122478] ------------------------------------------------------
> [ 69.128995] swapper/2/0 is trying to acquire lock:
> [ 69.134270] ffff94bac09cb5b8 (&p->pi_lock){-.-.}-{2:2}, at: get_wchan+0x32/0x80
> [ 69.141938]
> [ 69.141938] but task is already holding lock:
> [ 69.148104] ffff94bb5a7f7dd8 (&rq->__lock){-.-.}-{2:2}, at: _raw_spin_rq_lock_irqsave+0x2c/0x50
> [ 69.157339]
> [ 69.157339] which lock already depends on the new lock.
> [ 69.157339]
> [ 69.165841]
> [ 69.165841] the existing dependency chain (in reverse order) is:
> [ 69.173805]
> [ 69.173805] -> #1 (&rq->__lock){-.-.}-{2:2}:
> [ 69.180186] __lock_acquire+0x51f/0xb90
> [ 69.184799] lock_acquire.part.0+0x6e/0x1c0
> [ 69.190031] _raw_spin_lock_nested+0x33/0x80
> [ 69.195285] raw_spin_rq_lock_nested+0x15/0x30
> [ 69.200854] task_fork_fair+0x3e/0xe0
> [ 69.206084] sched_cgroup_fork+0x11e/0x160
> [ 69.211425] copy_process+0x12d6/0x1c70
> [ 69.216385] kernel_clone+0x9e/0x390
> [ 69.220996] user_mode_thread+0x5f/0x90
> [ 69.225841] rest_init+0x1e/0x190
> [ 69.230291] start_kernel+0x5d6/0x6a0
> [ 69.235052] x86_64_start_reservations+0x18/0x30
> [ 69.240660] x86_64_start_kernel+0x92/0xa0
> [ 69.245837] common_startup_64+0x13e/0x141
> [ 69.250905]
> [ 69.250905] -> #0 (&p->pi_lock){-.-.}-{2:2}:
> [ 69.257221] check_prev_add+0xf1/0xce0
> [ 69.262034] validate_chain+0x406/0x4c0
> [ 69.266893] __lock_acquire+0x51f/0xb90
> [ 69.271676] lock_acquire.part.0+0x6e/0x1c0
> [ 69.276901] __raw_spin_lock_irq+0x40/0x90
> [ 69.281950] get_wchan+0x32/0x80
> [ 69.286109] __update_stats_enqueue_sleeper+0x153/0x380
> [ 69.292413] enqueue_entity+0x42b/0x580
> [ 69.297274] enqueue_task_fair+0x97/0x3d0
> [ 69.302301] activate_task+0x4e/0xe0
> [ 69.306790] ttwu_do_activate+0x5d/0x2a0
> [ 69.311689] sched_ttwu_pending+0xe7/0x1a0
> [ 69.316756] __flush_smp_call_function_queue+0x1d9/0x690
> [ 69.323108] __sysvec_call_function_single+0x3c/0x150
> [ 69.329181] sysvec_call_function_single+0x68/0x90
> [ 69.334953] asm_sysvec_call_function_single+0x1a/0x20
> [ 69.341089] pv_native_safe_halt+0xf/0x20
> [ 69.346141] acpi_safe_halt+0x14/0x20
> [ 69.350757] acpi_idle_enter+0x83/0xd0
> [ 69.355541] cpuidle_enter_state+0xcb/0x550
> [ 69.360770] cpuidle_enter+0x2d/0x40
> [ 69.365292] cpuidle_idle_call+0xf4/0x160
> [ 69.370263] do_idle+0x96/0xf0
> [ 69.374282] cpu_startup_entry+0x29/0x30
> [ 69.379178] start_secondary+0x121/0x140
> [ 69.384062] common_startup_64+0x13e/0x141
> [ 69.389210]
> [ 69.389210] other info that might help us debug this:
> [ 69.389210]
> [ 69.397778] Possible unsafe locking scenario:
> [ 69.397778]
> [ 69.404212] CPU0 CPU1
> [ 69.409217] ---- ----
> [ 69.414230] lock(&rq->__lock);
> [ 69.417973] lock(&p->pi_lock);
> [ 69.424200] lock(&rq->__lock);
> [ 69.430373] lock(&p->pi_lock);
> [ 69.434115]
> [ 69.434115] *** DEADLOCK ***
> [ 69.434115]
> [ 69.440608] 1 lock held by swapper/2/0:
> [ 69.444875] #0: ffff94bb5a7f7dd8 (&rq->__lock){-.-.}-{2:2}, at: _raw_spin_rq_lock_irqsave+0x2c/0x50
> [ 69.454596]
> [ 69.454596] stack backtrace:
> [ 69.459437] CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.10.0 #95
> [ 69.467336] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> [ 69.478770] Call Trace:
> [ 69.481622] <IRQ>
> [ 69.484047] dump_stack_lvl+0x79/0xb0
> [ 69.488303] check_noncircular+0x133/0x150
> [ 69.492874] ? lock_is_held_type+0xaf/0x130
> [ 69.497547] ? update_sd_lb_stats.constprop.0+0x284/0x2d0
> [ 69.503424] check_prev_add+0xf1/0xce0
> [ 69.507613] ? add_chain_cache+0x111/0x310
> [ 69.512100] validate_chain+0x406/0x4c0
> [ 69.516327] __lock_acquire+0x51f/0xb90
> [ 69.520523] lock_acquire.part.0+0x6e/0x1c0
> [ 69.525068] ? get_wchan+0x32/0x80
> [ 69.528858] ? lock_acquire+0xf4/0x130
> [ 69.533053] ? get_wchan+0x32/0x80
> [ 69.536923] __raw_spin_lock_irq+0x40/0x90
> [ 69.541444] ? get_wchan+0x32/0x80
> [ 69.545240] get_wchan+0x32/0x80
> [ 69.548942] __update_stats_enqueue_sleeper+0x153/0x380
> [ 69.554443] enqueue_entity+0x42b/0x580
> [ 69.558693] enqueue_task_fair+0x97/0x3d0
> [ 69.563127] activate_task+0x4e/0xe0
> [ 69.567123] ttwu_do_activate+0x5d/0x2a0
> [ 69.571500] sched_ttwu_pending+0xe7/0x1a0
> [ 69.576099] __flush_smp_call_function_queue+0x1d9/0x690
> [ 69.581823] __sysvec_call_function_single+0x3c/0x150
> [ 69.587257] sysvec_call_function_single+0x68/0x90
> [ 69.592491] </IRQ>
> [ 69.595084] <TASK>
> [ 69.597586] asm_sysvec_call_function_single+0x1a/0x20
> [ 69.603105] RIP: 0010:pv_native_safe_halt+0xf/0x20
> [ 69.608395] Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 65 3f 3e 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
> [ 69.627892] RSP: 0018:ffffb58b000efe50 EFLAGS: 00000246
> [ 69.633594] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0000000000000000
> [ 69.641202] RDX: ffff94bb5a600000 RSI: ffff94bb79020800 RDI: ffff94bb79020864
> [ 69.648787] RBP: ffff94bb79020864 R08: ffffffff96b63420 R09: 0000000000000000
> [ 69.656489] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
> [ 69.663706] R13: ffffffff96b634a0 R14: 0000000000000001 R15: 00000000001f6a0c
> [ 69.671313] acpi_safe_halt+0x14/0x20
> [ 69.675411] acpi_idle_enter+0x83/0xd0
> [ 69.679550] cpuidle_enter_state+0xcb/0x550
> [ 69.684224] cpuidle_enter+0x2d/0x40
> [ 69.688203] cpuidle_idle_call+0xf4/0x160
> [ 69.692204] do_idle+0x96/0xf0
> [ 69.695673] cpu_startup_entry+0x29/0x30
> [ 69.699998] start_secondary+0x121/0x140
> [ 69.704366] common_startup_64+0x13e/0x141
> [ 69.708374] </TASK>
> ----------------------------------------