RE: [PATCH] ceph: add timeout protection to ceph_mdsc_sync() path
From: Viacheslav Dubeyko
Date: Tue Feb 17 2026 - 16:54:07 EST
On Fri, 2026-02-13 at 09:51 +0200, Ionut Nechita (Wind River) wrote:
> I also created a tracker issue for this on the Ceph bug tracker:
>
> https://tracker.ceph.com/issues/74897
>
It looks like I was able to reproduce the symptoms of the issue by running
the generic/013 xfstests test case in a loop:
#!/bin/bash
while true; do
sudo ./check generic/013
done
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.346895] INFO: task fsstress:14466 blocked for more than 122 seconds.
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.347995] Not tainted 6.19.0-rc8+ #10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.348530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349426] task:fsstress state:D stack:0 pid:14466 tgid:14466 ppid:14464 task_flags:0x400140 flags:0x00080800
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349438] Call Trace:
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349441] <TASK>
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349445] __schedule+0xe8a/0x57f0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349457] ? kasan_save_stack+0x39/0x60
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349466] ? kasan_save_stack+0x26/0x60
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349471] ? kasan_save_track+0x14/0x40
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349475] ? kasan_save_free_info+0x3b/0x60
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349485] ? __kasan_slab_free+0x7a/0xb0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349489] ? ceph_mdsc_release_request+0x6a3/0x880
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349497] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349502] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349507] ? __pv_queued_spin_lock_slowpath+0xb04/0xf80
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349514] ? __pfx___schedule+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349520] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349525] ? __call_rcu_common+0x386/0x14b0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349532] schedule+0x75/0x2f0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349538] schedule_timeout+0x16d/0x210
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349542] ? __pfx_schedule_timeout+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349548] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349553] ? _raw_spin_lock_irq+0x8b/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349559] ? __pfx__raw_spin_lock_irq+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349565] ? kasan_save_track+0x14/0x40
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349569] wait_for_completion+0x14a/0x340
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349573] ? __pfx_wait_for_completion+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349577] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349582] ? __pfx_mutex_unlock+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349587] ceph_mdsc_sync+0x4b4/0xe80
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349593] ? __pfx_ceph_mdsc_sync+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349597] ? ceph_osdc_put_request+0x38/0x770
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349603] ? ceph_osdc_sync+0x1cb/0x350
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349608] ceph_sync_fs+0xa0/0x4c0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349612] sync_filesystem+0x182/0x240
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349618] __x64_sys_syncfs+0xac/0x160
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349623] x64_sys_call+0x746/0x2360
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349629] do_syscall_64+0x82/0x5d0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349635] ? __x64_sys_openat+0x108/0x240
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349641] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349647] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349655] ? __pfx___x64_sys_openat+0x10/0x10
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349661] ? __kasan_check_write+0x14/0x30
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349667] ? ksys_write+0x1a3/0x230
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349672] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349677] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349682] ? do_syscall_64+0xbf/0x5d0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349687] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349692] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349705] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349709] ? do_syscall_64+0xbf/0x5d0
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349715] ? __kasan_check_read+0x11/0x20
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349720] ? fpregs_assert_state_consistent+0x5c/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349724] ? irqentry_exit+0xa5/0x600
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349730] ? exc_page_fault+0x95/0x100
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349736] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349740] RIP: 0033:0x792fb1d1ba4b
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349745] RSP: 002b:00007ffc3844eb58 EFLAGS: 00000246 ORIG_RAX: 0000000000000132
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349752] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000792fb1d1ba4b
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349756] RDX: 0000000000000000 RSI: 000059045610b440 RDI: 0000000000000004
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349759] RBP: 0000000000000004 R08: 0000000000000026 R09: 00007ffc3844e986
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349762] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000149
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349765] R13: 00007ffc3844eba0 R14: 000059042de9d0b3 R15: 0000000000000149
Feb 16 15:46:30 ceph-0005 kernel: [ 1845.349771] </TASK>
l *ceph_mdsc_sync+0x4b4
0xffffffff82cddbe4 is in ceph_mdsc_sync (fs/ceph/mds_client.c:5916).
5911                    }
5912                    doutc(cl, "wait on %llu (want %llu)\n",
5913                          req->r_tid, want_tid);
5914                    wait_for_completion(&req->r_safe_completion);
5915
5916                    mutex_lock(&mdsc->mutex);
5917                    ceph_mdsc_put_request(req);
5918                    if (!nextreq)
5919                            break; /* next dne before, so we're done! */
5920                    if (RB_EMPTY_NODE(&nextreq->r_node)) {
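The task is stuck in the unbounded wait_for_completion() at line 5914, which is
exactly the kind of call site the patch subject targets. Just as an illustration
of the general direction (this is only a sketch of mine, not the actual patch;
the 60-second timeout, the killable variant, and the warning text are my
assumptions), the wait could be bounded along these lines:

```c
/* Hypothetical sketch only -- timeout value and handling are assumptions. */
long ret;

ret = wait_for_completion_killable_timeout(&req->r_safe_completion,
					   msecs_to_jiffies(60 * 1000));
if (ret == 0) {
	/* Timed out: report and bail out instead of hanging forever. */
	pr_warn("ceph: sync wait on tid %llu timed out\n", req->r_tid);
} else if (ret == -ERESTARTSYS) {
	/* Interrupted by a fatal signal; unwind accordingly. */
}
```

wait_for_completion_killable_timeout() returns the remaining jiffies on
success, 0 on timeout, and -ERESTARTSYS if a fatal signal arrived, so both
failure paths would need explicit cleanup before taking mdsc->mutex again.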
I am not sure yet that the reason is the same.
Thanks,
Slava.