Re: [PATCH] ceph: add timeout protection to ceph_mdsc_sync() path

From: Sebastian Andrzej Siewior

Date: Wed Feb 11 2026 - 02:22:17 EST


On 2026-02-08 15:18:20 [+0200], Ionut Nechita (Wind River) wrote:
> From: Ionut Nechita <ionut.nechita@xxxxxxxxxxxxx>
>
> When the Ceph MDS becomes unreachable (e.g., due to IPv6 EADDRNOTAVAIL
> during duplicate address detection (DAD) or network transitions), the
> sync syscall can block indefinitely in ceph_mdsc_sync(). The hung_task
> repeatedly (122s, 245s, 368s... up to 983+ seconds) with traces like:
>
> INFO: task sync:12345 blocked for more than 122 seconds.
> Call Trace:
> ceph_mdsc_sync+0x4d6/0x5a0 [ceph]
> ceph_sync_fs+0x31/0x130 [ceph]
> iterate_supers+0x97/0x100
> ksys_sync+0x32/0xb0
>
> Three functions in the MDS sync path use indefinite waits:
>
> 1. wait_caps_flush() uses wait_event() with no timeout
> 2. flush_mdlog_and_wait_mdsc_unsafe_requests() uses
> wait_for_completion() with no timeout
> 3. ceph_mdsc_sync() returns void, so it cannot propagate errors
>
> This is particularly problematic in Kubernetes environments with
> PREEMPT_RT kernels where Ceph storage pods undergo rolling updates
> and IPv6 network reconfigurations cause temporary MDS unavailability.

I may have misunderstood this, but how is this different from a
!PREEMPT_RT kernel? As far as I understand, there should be no
difference in how the two kernels react to this situation.
Could you check with lockdep and might_sleep() whether there is a
locking problem and some kind of state is lost or wrongly interpreted?

Sebastian