Re: linux-5.15.69 breaks nfs client

From: Thorsten Leemhuis
Date: Fri Sep 23 2022 - 03:47:03 EST


Hi, this is your Linux kernel regression tracker. CCing the regression
mailing list, as it should be in the loop for all regressions, as
explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
Also CCing the stable ml, the NFS maintainers, and the authors of
31b992b3c39b, too.

On 22.09.22 23:46, Kurt Garloff wrote:
>
> a freshly compiled 5.15.69 kernel showed hangs with NFS.
> Typically mkdir would end up in a 'D' process state, but I
> have seen ls -l hanging as well.
> Server is kernel NFS 5.15.69.
>
> After reverting the last three NFS related commits,
> a68a734b19af NFS: Fix WARN_ON due to unionization of nfs_inode.nrequests
> 3b97deb4abf5 NFS: Fix another fsync() issue after a server reboot
> 31b992b3c39b NFS: Save some space in the inode
>
> things work normally again.
>
> As you can see, I suspected 31b992b3c39b ...

FWIW, that's e591b298d7ec in mainline.

> I know this report is light on details; if nothing like this has been
> reported yet, let me know and I'll try to find some time to investigate
> further.
>
> PS: Please keep me on Cc, I'm not subscribed to linux-nfs.

I wonder if this is a dup of this report:

https://lore.kernel.org/all/c5d8485b-0dbc-5192-4dc6-10ef2b86b520@xxxxxxxxxxxxx/

In that thread Trond mentioned
```
I believe this is a dependency that was introduced by the back port of
commit e591b298d7ec ("NFS: Save some space in the inode") into 5.15.68.
So the reason it wasn't seen is because the change is very recent.

FYI Greg and Sasha: please also consider pulling 6e176d47160c ("NFSv4:
Fixes for nfs4_inode_return_delegation()") into that stable series.
```

Anyway, for the rest of this mail:
[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Thanks for the report. To be sure the issue below doesn't fall through
the cracks unnoticed, I'm adding it to regzbot, my Linux kernel
regression tracking bot:

#regzbot ^introduced 31b992b3c39b
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally while also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained in
the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply; it's in everyone's interest to set the public record straight.