RE: Re: RE: [PATCH] Revert "ceph: when filling trace, call ceph_get_inode outside of mutexes"

From: Viacheslav Dubeyko

Date: Wed Apr 29 2026 - 13:40:17 EST


On Tue, 2026-04-28 at 17:04 +0000, Viacheslav Dubeyko wrote:
> On Tue, 2026-04-28 at 05:31 +0000, 李磊 wrote:
> >
> > > >
>
> <skipped>
>
> > >
> > > As far as I can see, I was able to reproduce the issue without your patch
> > > applied. It looks like I was "lucky" to reproduce the issue. I need to take a
> > > deeper look into the issue. But your patch is not responsible for the issue. Let
> > > me spend some time to analyze the issue reason and environment.
> > >
> > > Have you been able to reproduce the issue on your side?
> >
> > I’ve run generic/701 several more times, but no issues occurred.
> >
>
> I assume that running the generic/701 test case alone cannot reproduce the
> issue. My understanding is that the whole auto group needs to be executed.
> Somehow, I was lucky enough to reproduce the issue. :)
>
> > Judging only from the stack trace, I believe it’s a deadlock between
> > writeback and memory reclaiming. This dependency chain is as follows
> >
> > 1. Several D threads are waiting for the osdc->lock to be released.
> > 2. The osdc->lock is held by kworker/u32:0:241092 which is performing writeback,
> > and kworker/u32:0:241092 is asking for con->mutex to send_request.
> > 3. The con->mutex is held by kworker/7:4:308292 which is currently calling do_sendmsg()
> >
> > Because of the memory shortage, step 3 has to reclaim memory and wait for step 2 to write back
> > and free some folios. But step 2 is blocked on con->mutex, which is already held by step 3.
> >
>
> I am going to take a deeper look into the issue today. But one day will
> probably not be enough. :)
>
>
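
The dependency chain quoted above can be written down as a small wait-for
graph. What follows is only a userspace sketch of that reasoning, not kernel
code. The enum labels and the helpers fill_reported_chain() and
blocked_forever() are hypothetical, and the edge from the sender back to the
writeback worker encodes the assumption from the quoted analysis that reclaim
has to wait for writeback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the tasks named in the quoted stack-trace analysis. */
enum task {
	TASK_D_THREAD,   /* any thread in D state waiting for osdc->lock */
	TASK_WRITEBACK,  /* kworker/u32:0: holds osdc->lock, wants con->mutex */
	TASK_SENDER,     /* kworker/7:4: holds con->mutex, stuck in reclaim */
	TASK_COUNT,
	TASK_NONE = -1   /* marker: this task is not waiting on anyone */
};

/* Record who waits for whom, following steps 1-3 of the quoted analysis. */
void fill_reported_chain(int waits_for[TASK_COUNT])
{
	waits_for[TASK_D_THREAD]  = TASK_WRITEBACK; /* step 1: needs osdc->lock */
	waits_for[TASK_WRITEBACK] = TASK_SENDER;    /* step 2: needs con->mutex */
	/* Assumption: reclaim in do_sendmsg() can only progress via writeback. */
	waits_for[TASK_SENDER]    = TASK_WRITEBACK;
}

/*
 * Follow the wait-for edges from 'start'. Returning to an already visited
 * task means the chain ends in a cycle, so 'start' can never make progress.
 */
bool blocked_forever(const int waits_for[TASK_COUNT], int start)
{
	bool seen[TASK_COUNT] = { false };
	int cur = start;

	assert(start >= 0 && start < TASK_COUNT);
	while (cur != TASK_NONE) {
		if (seen[cur])
			return true;
		seen[cur] = true;
		cur = waits_for[cur];
	}
	return false;
}
```

In this model the D-state threads are not part of the cycle themselves, they
are only queued behind it. Removing the sender's dependency on writeback
(setting its entry to TASK_NONE) is enough to unblock everybody.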

As far as I can see, the main problem is memory pressure. And it looks like
this patch can fix the issue:

From 84572f7fcb7aad422c71c87b52dfea805eeb7fbd Mon Sep 17 00:00:00 2001
From: Hristo Venev <hristo@xxxxxxxxxx>
Date: Mon, 6 Apr 2026 16:01:20 +0300
Subject: [PATCH] ceph: put folios not suitable for writeback

The batch holds references to the folios (see `filemap_get_folios`,
`folio_batch_release`), so we need to `folio_put` the folios we remove.
---
fs/ceph/addr.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 390f122feeaa..10110af74715 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1329,6 +1329,7 @@ int ceph_process_folio_batch(struct address_space *mapping,
if (rc == -ENODATA) {
rc = 0;
folio_unlock(folio);
+ folio_put(folio);
ceph_wbc->fbatch.folios[i] = NULL;
continue;
} else if (rc == -E2BIG) {
@@ -1340,6 +1341,7 @@ int ceph_process_folio_batch(struct address_space *mapping,
if (!folio_clear_dirty_for_io(folio)) {
doutc(cl, "%p !folio_clear_dirty_for_io\n", folio);
folio_unlock(folio);
+ folio_put(folio);
ceph_wbc->fbatch.folios[i] = NULL;
continue;
}
--
2.53.0
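
To illustrate why the missing folio_put() matters, here is a minimal userspace
model of the reference counting described in the commit message. It is a
sketch, not the kernel implementation: struct model_folio, struct model_batch
and all model_* helpers are hypothetical, and the model assumes that a slot
set to NULL is never seen again by whoever releases the batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BATCH_SIZE 4	/* arbitrary size for the model */

struct model_folio { int refcount; };

struct model_batch {
	size_t nr;
	struct model_folio *folios[MODEL_BATCH_SIZE];
};

/* Models the lookup step: the batch takes one reference per folio. */
void model_batch_fill(struct model_batch *b, struct model_folio *pool, size_t n)
{
	b->nr = 0;
	for (size_t i = 0; i < n && i < MODEL_BATCH_SIZE; i++) {
		pool[i].refcount++;
		b->folios[b->nr++] = &pool[i];
	}
}

/* Models the two skip paths touched by the patch. */
void model_batch_skip(struct model_batch *b, size_t i, bool put_on_skip)
{
	assert(i < b->nr && b->folios[i]);
	if (put_on_skip)
		b->folios[i]->refcount--;	/* the added folio_put() */
	b->folios[i] = NULL;
}

/* Models the release step: only folios still in the batch are put. */
void model_batch_release(struct model_batch *b)
{
	for (size_t i = 0; i < b->nr; i++) {
		if (b->folios[i])
			b->folios[i]->refcount--;
		b->folios[i] = NULL;
	}
	b->nr = 0;
}

/* Count references that remain after one fill/skip/release round. */
int model_leaked_refs(bool put_on_skip)
{
	/* Each folio starts with one reference owned by someone else. */
	struct model_folio pool[3] = { { 1 }, { 1 }, { 1 } };
	struct model_batch batch;
	int leaked = 0;

	model_batch_fill(&batch, pool, 3);
	model_batch_skip(&batch, 1, put_on_skip);
	model_batch_release(&batch);

	for (size_t i = 0; i < 3; i++)
		leaked += pool[i].refcount - 1;
	return leaked;
}
```

Without the put, every skipped folio keeps one extra reference in this model.
A folio with an elevated reference count cannot be freed by reclaim, which
would be consistent with the memory pressure seen during the test run.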

I am going to run xfstests again without this patch to check whether the issue
reproduces.

Thanks,
Slava.