Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann
Date: Fri Dec 06 2024 - 11:30:14 EST
On Fri, Dec 6, 2024 at 4:08 PM Max Kellermann <max.kellermann@xxxxxxxxx> wrote:
> Similar hangs wth 6.12.2 (vanilla without your "netfs-writeback" branch):
(Correction: this was 6.12.3, not 6.12.2)
I tried with 6.12.3 + dhowells/netfs-writeback; David's branch solved
many problems and it took much longer to trigger the hang, but it
eventually occurred:
INFO: task bash:6599 blocked for more than 122 seconds.
Not tainted 6.12.3-cm4all0-hp+ #298
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:bash state:D stack:0 pid:6599 tgid:6599
ppid:6598 flags:0x00000006
Call Trace:
<TASK>
__schedule+0xc34/0x4df0
? is_dynamic_key+0x120/0x150
? __pfx___schedule+0x10/0x10
? lock_release+0x206/0x660
? schedule+0x283/0x340
? __pfx_lock_release+0x10/0x10
? schedule+0x1e8/0x340
schedule+0xdc/0x340
schedule_preempt_disabled+0xa/0x10
rwsem_down_read_slowpath+0x6ba/0xd00
? __pfx_rwsem_down_read_slowpath+0x10/0x10
? kernel_text_address+0xb8/0x150
? lock_acquire+0x11f/0x290
? ceph_start_io_read+0x19/0x80
down_read+0xcd/0x220
? __pfx_down_read+0x10/0x10
? do_sys_openat2+0x106/0x160
? stack_trace_save+0x96/0xd0
? __pfx_stack_trace_save+0x10/0x10
ceph_start_io_read+0x19/0x80
ceph_read_iter+0x2e2/0xe70
? __pfx_ceph_read_iter+0x10/0x10
? psi_task_switch+0x256/0x810
? find_held_lock+0x2d/0x110
? lock_release+0x206/0x660
? finish_task_switch.isra.0+0x1db/0xa40
vfs_read+0x6e1/0xc40
? lock_acquire+0x11f/0x290
? finish_task_switch.isra.0+0x129/0xa40
? __pfx_vfs_read+0x10/0x10
? finish_task_switch.isra.0+0x225/0xa40
? fdget_pos+0x1b3/0x540
? __pfx___seccomp_filter+0x10/0x10
ksys_read+0xee/0x1c0
? __pfx_ksys_read+0x10/0x10
? lock_release+0x206/0x660
? __ceph_do_getattr+0xe8/0x380
do_syscall_64+0x64/0x100
? fdget_raw+0x53/0x390
? __do_sys_newfstatat+0x86/0xd0
? __pfx___do_sys_newfstatat+0x10/0x10
? syscall_exit_to_user_mode+0x57/0x120
? do_syscall_64+0x70/0x100
? irqentry_exit_to_user_mode+0x3d/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f71fb04f21d
RSP: 002b:00007ffdd516a918 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f71fb04f21d
RDX: 0000000000003003 RSI: 00005597dc55d4d0 RDI: 0000000000000003
RBP: 0000000000003003 R08: 00007f71fb12a020 R09: 00007f71fb12a020
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000001f4
R13: 00005597dc485340 R14: 00005597dc55d4d0 R15: 00005597be8d7524
</TASK>
INFO: task bash:6614 blocked for more than 122 seconds.
Not tainted 6.12.3-cm4all0-hp+ #298
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:bash state:D stack:0 pid:6614 tgid:6614
ppid:6613 flags:0x00000002
Call Trace:
<TASK>
__schedule+0xc34/0x4df0
? __free_insn_slot+0x370/0x3d0
? __pfx___schedule+0x10/0x10
? lock_release+0x206/0x660
? schedule+0x283/0x340
? __pfx_lock_release+0x10/0x10
? schedule+0x1e8/0x340
schedule+0xdc/0x340
schedule_preempt_disabled+0xa/0x10
rwsem_down_read_slowpath+0x6ba/0xd00
? __pfx_rwsem_down_read_slowpath+0x10/0x10
? kasan_save_stack+0x1c/0x40
? kasan_save_track+0x10/0x30
? lock_acquire+0x11f/0x290
? ceph_start_io_read+0x19/0x80
? find_held_lock+0x2d/0x110
down_read+0xcd/0x220
? __ceph_caps_issued_mask+0x416/0xa10
? __pfx_down_read+0x10/0x10
ceph_start_io_read+0x19/0x80
ceph_read_iter+0x2e2/0xe70
? _copy_to_user+0x50/0x70
? __pfx_ceph_read_iter+0x10/0x10
? fdget_raw+0x53/0x390
vfs_read+0x6e1/0xc40
? __do_sys_newfstatat+0x86/0xd0
? __pfx___do_sys_newfstatat+0x10/0x10
? __pfx_vfs_read+0x10/0x10
? fdget_pos+0x1b3/0x540
? __pfx___seccomp_filter+0x10/0x10
ksys_read+0xee/0x1c0
? __pfx_ksys_read+0x10/0x10
do_syscall_64+0x64/0x100
? do_user_addr_fault+0x401/0x8f0
? find_held_lock+0x59/0x110
? find_held_lock+0x2d/0x110
? lock_release+0x206/0x660
? do_user_addr_fault+0x45e/0x8f0
? __pfx_lock_release+0x10/0x10
? do_user_addr_fault+0x401/0x8f0
? do_user_addr_fault+0x463/0x8f0
? irqentry_exit_to_user_mode+0x3d/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa84a43a21d
RSP: 002b:00007ffec8720278 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa84a43a21d
RDX: 0000000000003003 RSI: 0000565007f224d0 RDI: 0000000000000003
RBP: 0000000000003003 R08: 00007fa84a515020 R09: 00007fa84a515020
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000001f4
R13: 0000565007e4a340 R14: 0000565007f224d0 R15: 0000565005ba8524
</TASK>
Showing all locks held in the system:
1 lock held by khungtaskd/163:
#0: ffffffffae629b80 (rcu_read_lock){....}-{1:2}, at:
debug_show_all_locks+0x64/0x280
2 locks held by bash/3365:
#0: ffff8881661803e0 (sb_writers#19){....}-{0:0}, at: ksys_write+0xee/0x1c0
#1: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at:
ceph_start_io_write+0x15/0x30
1 lock held by bash/6599:
#0: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at:
ceph_start_io_read+0x19/0x80
1 lock held by bash/6614:
#0: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at:
ceph_start_io_read+0x19/0x80
=============================================