Re: [PATCH] ceph: fix hanging __ceph_get_caps() with stale `mds_wanted`
From: Max Kellermann
Date: Wed Apr 29 2026 - 00:28:13 EST
On Tue, Apr 28, 2026 at 8:46 PM Viacheslav Dubeyko
<Slava.Dubeyko@xxxxxxx> wrote:
> > flags |= NON_BLOCKING;
> > while (!(ret = try_get_cap_refs(inode, need, want,
> > endoff, flags, &_got))) {
> > + static const unsigned long wait_timeout = 5 * HZ;
>
> Why exactly 5 * HZ? What is the basis for this timeout? Could we re-use any
> available timeouts in CephFS declarations?
It is an arbitrary timeout, long enough to avoid unnecessary wakeups
in regular situations where we're really waiting for a capability, but
short enough to avoid disrupting the service. If you prefer another
number, say it, and I'll change it.
On our servers, this hang bug occurs very rarely. Sometimes, weeks go
by without a hang, and sometimes twice a day, but no more. When it
happens, I thought it's fine to wait 5 seconds before attempting to
recover.
Note that this is still a minimal workaround, not a proper fix, as I
wrote. The proper fix would be much more intrusive (and of course go
without any hard-coded timeouts and arbitrary wakeups).
> > if (ret == -EUCLEAN) {
> > - /* session was killed, try renew caps */
> > - ret = ceph_renew_caps(inode, flags);
> > + /* session was killed or a waited cap
> > + * request needs a retry */
> > + ret = ceph_renew_caps(inode, flags & CEPH_FILE_MODE_MASK);
>
> Frankly speaking, I don't quite follow why do we need to add flags &
> CEPH_FILE_MODE_MASK?
Oh, this is an unrelated fix. This "flags" variable contains internal
flags that are not understood by ceph_renew_caps() (i.e.
CHECK_FILELOCK and NON_BLOCKING) and shouldn't really be passed there.
I'll remove that change from the patch for v2 because while I believe
it's correct, it has nothing to do with this bug. I'll post v2 when
we agree on the rest of the patch.
--
Max Kellermann
Principal Architect
Hosting Technology
cm4all | Im Mediapark 6a | 50670 Köln | Germany
General information about the company can be found here:
https://www.cm4all.com/impressum
A member of the IONOS Group