Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in "D" state)

From: Brian Pardy
Date: Wed Sep 06 2023 - 17:04:05 EST


Added committer Ronnie Sahlberg to CC.

On Tue, Sep 5, 2023 at 9:01 PM Bagas Sanjaya <bagasdotme@xxxxxxxxx> wrote:
> On Tue, Sep 05, 2023 at 01:09:05PM -0400, Brian Pardy wrote:
> > I've noticed an issue with the CIFS client in kernel 6.5.0/6.5.1 that
> > does not exist in 6.4.12 or other previous kernels (I have not tested
> > 6.4.13). Almost immediately after mounting a CIFS share, the reported
> > load average on my system goes up by 2. At the time this occurs I see
> > two [cifsd-cfid-laundromat] kernel threads running in the "D" state,
> > where they remain for the entire time the CIFS share is mounted. The
> > load will remain stable at 2 (otherwise idle) until the share is
> > unmounted, at which point the [cifsd-cfid-laundromat] threads
> > disappear and load drops back down to 0. This is easily reproducible
> > on my system, but I am not sure what to do to retrieve more useful
> > debugging information. If I mount two shares from this server, I get
> > four laundromat threads in "D" state and a sustained load average of
> > 4.
> >
> > The client is running Gentoo Linux, the server is a Seagate Personal
> > Cloud NAS running Samba 4.6.5. Mount options used are
> > "noperm,guest,vers=3.02". The CPUs do not actually appear to be
> > spinning, the reported load average appears incorrect as far as actual
> > CPU use is concerned.
>
> Thanks for the regression report. But if you want to get it fixed,
> you have to do your part: perform bisection. See
> Documentation/admin-guide/bug-bisect.rst in the kernel sources for how
> to do that.
>
> Anyway, I'm adding it to regzbot:
>
> #regzbot ^introduced: v6.4..v6.5
> #regzbot title: incorrect CPU utilization report (multiplied) when mounting CIFS

Thank you for directing me to the bug-bisect documentation. Results below:

# git bisect bad
d14de8067e3f9653cdef5a094176d00f3260ab20 is the first bad commit
commit d14de8067e3f9653cdef5a094176d00f3260ab20
Author: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
Date: Thu Jul 6 12:32:24 2023 +1000

    cifs: Add a laundromat thread for cached directories

    and drop cached directories after 30 seconds

    Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
    Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>

 fs/smb/client/cached_dir.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/smb/client/cached_dir.h |  1 +
 2 files changed, 68 insertions(+)

I do not know what other debug info may be useful, but here is
/proc/[pid]/stack output for one of these threads in D state:

# cat /proc/17314/stack
[<0>] msleep+0x24/0x40
[<0>] cifs_cfids_laundromat_thread+0x5e/0x1c0 [cifs]
[<0>] kthread+0xc4/0xf0
[<0>] ret_from_fork+0x28/0x40
[<0>] ret_from_fork_asm+0x1b/0x30
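
The stack shows the thread sitting in msleep(), which sleeps
uninterruptibly. Linux counts uninterruptible ("D") tasks in the load
average, so this would be consistent with a load of 2 and no real CPU
use. To list such tasks next to the current load average, a minimal
sketch follows. It assumes the usual Linux /proc layout; the function
names parse_stat_line and tasks_in_state are hypothetical helpers made
up for this illustration and do not come from any existing tool.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(wanted="D", proc_root="/proc"):
    """Collect (pid, comm, state) for every task whose state matches."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                task = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited while scanning, or entry unreadable
        if task[2] == wanted:
            found.append(task)
    return found


if __name__ == "__main__":
    for pid, comm, state in tasks_in_state("D"):
        print(pid, comm, state)
    with open("/proc/loadavg") as f:
        print("loadavg:", f.read().strip())
```

Run once with the share mounted and once after unmounting, and compare
the number of "D" entries with the first field of /proc/loadavg.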

I will provide any other details requested. Thank you.

#regzbot introduced: d14de8067e3