Re: Re: Re: Re: [External mail!] Re: [PATCH v2] ceph: fix deadlock in ceph_readdir_prepopulate
From: Viacheslav Dubeyko
Date: Wed Jan 21 2026 - 15:51:49 EST
On Wed, 2026-01-21 at 07:33 +0000, 李磊 wrote:
> Hi Slava,
>
> Zhao and I have found a way to reproduce this issue.
Sounds great!
>
> 1. Find two different directories (DIR_a, DIR_b) in a cephfs cluster and make sure they have different auth MDS nodes. This
> way, the client has a chance to run handle_reply() on different CPUs for our test (see steps 4 and 6).
> 2. In DIR_b, create a hard link to DIR_a/FILE_a, named FILE_b. DIR_a/FILE_a and DIR_b/FILE_b then share the same ino (e.g. 123456).
> 3. Hard-code that ino in the patch below so handle_reply() sleeps, stalling the stat command (note that msleep() needs <linux/delay.h>):
> ```
> @@ -3950,6 +3951,10 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
> goto out_err;
> }
> req->r_target_inode = in;
> + if (in->i_ino == 123456) {
> + pr_err("inode %lu found, ready to wait 10 seconds.\n", in->i_ino);
> + msleep(10000);
> + }
> ```
> 4. `echo 3 > /proc/sys/vm/drop_caches`
> 5. In one shell, run `stat DIR_a/FILE_a`; this shell is expected to get stuck because of the msleep() in handle_reply().
> 6. In another shell, run `ls DIR_b/` to trigger ceph_readdir_prepopulate().
>
> Repeat steps 4 to 6 several times (5 times should be enough, I guess), and we'll see the deadlock.
>
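The quoted reproduction steps could be scripted roughly as below. This is a minimal sketch, not part of the original report: the mount point /mnt/cephfs, the directory and file names, and the loop count are assumptions, and it presupposes a kernel already carrying the msleep() debug patch keyed to FILE_a's inode number.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report above.
# Assumes a CephFS mount at /mnt/cephfs with DIR_a and DIR_b served by
# different auth MDS nodes, and a kernel patched with the msleep() hack.
MNT=/mnt/cephfs

if [ ! -d "$MNT/DIR_a" ] || [ ! -d "$MNT/DIR_b" ]; then
    echo "CephFS test directories not found; set up DIR_a/DIR_b first."
    exit 0
fi

# Step 2: hard link so both dentries share one inode.
ln "$MNT/DIR_a/FILE_a" "$MNT/DIR_b/FILE_b"
stat -c %i "$MNT/DIR_a/FILE_a"   # note this ino for the kernel patch

for i in 1 2 3 4 5; do
    # Step 4: drop caches so both lookups go to the MDS.
    echo 3 > /proc/sys/vm/drop_caches
    # Step 5: stat blocks in handle_reply() because of the msleep().
    stat "$MNT/DIR_a/FILE_a" &
    sleep 1
    # Step 6: readdir on the other directory races with the stalled reply.
    ls "$MNT/DIR_b/" &
    wait
done
```

On a machine without the patched kernel and a suitable CephFS mount, the script only prints the guard message and exits, so it is safe to inspect before adapting it to a real cluster.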
I am guessing... Is it possible to create a Ceph-specific test case in the
xfstests suite? It would be great to have a test case or unit test for
checking this issue in the future.
OK. I suggest adding this reproduction path and the other already-shared
explanation/analysis to the commit message and re-sending the patch. Could you
please send a new version of the patch?
Thanks,
Slava.
>
> ________________________________________
> 发件人: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
> 发送时间: 2026年1月8日 3:59
> 收件人: 李磊
> 抄送: Alex Markuze; idryomov@xxxxxxxxx; 孙朝; linux-kernel@xxxxxxxxxxxxxxx; ceph-devel@xxxxxxxxxxxxxxx
> 主题: Re: 答复: 答复: 【外部邮件!】Re: [PATCH v2] ceph: fix deadlock in ceph_readdir_prepopulate
>
> On Wed, 2026-01-07 at 16:01 +0000, 李磊 wrote:
> > Hi Slava,
> >
> > This issue is very rare on our internal cephfs clusters; we have encountered it only about three times.
> > But we are working on some hacking methods to speed up the reproduction. I think it will take me one week
> > if everything goes smoothly, and I will share the methods here.
> >
> > To be honest, this patch is effectively a revert of this one:
> >
> > commit bca9fc14c70fcbbebc84954cc39994e463fb9468
> > ("ceph: when filling trace, call ceph_get_inode outside of mutexes")
> >
> > I'll resend this patch later.
>
> Sounds good. If I remember correctly, the main issue with the initial patch was
> that the commit message didn't have a good explanation of the issue and of why this
> revert fixes it. So, if we have all of these details in the commit
> message, then the patch should be in good shape.
>
> Thanks,
> Slava.