Re: [PATCH v1 07/13] ceph: add timeout to caps wait in __ceph_get_caps()
From: Viacheslav Dubeyko
Date: Thu Mar 12 2026 - 15:52:59 EST
On Thu, 2026-03-12 at 10:16 +0200, Ionut Nechita (Wind River) wrote:
> From: Ionut Nechita <ionut.nechita@xxxxxxxxxxxxx>
>
> When waiting for caps in __ceph_get_caps(), the code uses
> wait_woken() with MAX_SCHEDULE_TIMEOUT, which can block
> indefinitely if the MDS is unavailable or slow to grant caps
> during reconnection.
>
> This causes hung task warnings when MDS fails over:
>
> INFO: task dd:12345 blocked for more than 122 seconds.
> Call Trace:
> __ceph_get_caps+0x...
> ceph_write_iter+0x...
>
> During MDS failover, caps may be revoked or delayed while the
> client reconnects. Processes waiting for caps block indefinitely,
> also holding i_rwsem which blocks other I/O operations on the
> same inode, causing a cascade of blocked processes.
>
> Fix this by using wait_woken() with mount_timeout instead of
> MAX_SCHEDULE_TIMEOUT. On timeout, return -ETIMEDOUT to allow
> the caller to handle the situation appropriately.
>
> Signed-off-by: Ionut Nechita <ionut.nechita@xxxxxxxxxxxxx>
> ---
> fs/ceph/caps.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index bed34fc11c919..c88e10a634e5c 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -3055,7 +3055,10 @@ int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need,
> {
> struct ceph_inode_info *ci = ceph_inode(inode);
> struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode);
> + struct ceph_client *cl = fsc->client;
> + unsigned long timeout = ceph_timeout_jiffies(cl->options->mount_timeout);
The same concern about the timeout value applies here. :)
> int ret, _got, flags;
> + bool warned = false;
Technically speaking, you are open-coding pr_warn_once_client() here.
We should probably prefer that one over pr_warn_ratelimited_client().
However, maybe we should have debug output instead. Either way, the warned
variable is completely unnecessary here.
Thanks,
Slava.
>
> ret = ceph_pool_perm_check(inode, need);
> if (ret < 0)
> @@ -3104,7 +3107,18 @@ int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need,
> ret = -ERESTARTSYS;
> break;
> }
> - wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
> + if (!wait_woken(&wait, TASK_INTERRUPTIBLE, timeout)) {
> + if (!warned) {
> + pr_warn_ratelimited_client(cl,
> + "%p %llx.%llx caps wait timed out (need %s want %s)\n",
> + inode, ceph_vinop(inode),
> + ceph_cap_string(need),
> + ceph_cap_string(want));
> + warned = true;
> + }
> + ret = -ETIMEDOUT;
> + break;
> + }
> }
>
> remove_wait_queue(&ci->i_cap_wq, &wait);