Re: [PATCH 5.15 00/57] 5.15.155-rc1 review

From: Chuck Lever
Date: Fri Apr 12 2024 - 16:24:27 EST


On Sat, Apr 13, 2024 at 01:41:52AM +0530, Harshit Mogalapalli wrote:
> Hi Greg, Chuck,
>
> On 12/04/24 21:27, Chuck Lever III wrote:
> >
> >
> > > I have noticed a regression in an LTP test case with nfsv4, and it was overlooked in the previous cycle (5.15.154). So the regression comes from the 5.15.153 --> 5.15.154 update, and I think it is due to the nfs backports we had in 5.15.154.
> > >
> > > # ./runltp -d /tmpdir -s fcntl17
> > >
> > > <<<test_start>>>
> > > tag=fcntl17 stime=1712915065
> ...
> > > fcntl17 1 TFAIL : fcntl17.c:429: Alarm expired, deadlock not detected
> > > fcntl17 0 TWARN : fcntl17.c:430: You may need to kill child processes by hand
> > > fcntl17 2 TPASS : Block 1 PASSED
> > > fcntl17 0 TINFO : Exit block 1
> > > fcntl17 0 TWARN : tst_tmpdir.c:342: tst_rmdir: rmobj(/tmpdir/ltp-jRFBtBQhhx/LTP_fcn9Xy4hM) failed: unlink(/tmpdir/ltp-jRFBtBQhhx/LTP_fcn9Xy4hM) failed; errno=2: ENOENT
> > >
> > >
> > > Steps used after installing latest ltp:
> > >
> > > $ mkdir /tmpdir
> > > $ yum install nfs-utils -y
> > > $ echo "/media *(rw,no_root_squash,sync)" >/etc/exports
> > > $ systemctl start nfs-server.service
> > > $ mount -o rw,nfsvers=3 127.0.0.1:/media /tmpdir
> > > $ cd /opt/ltp
> > > $ ./runltp -d /tmpdir -s fcntl17
> > >
> > >
> > >
> > > This does not happen in 5.15.153 tag.
> > >
> > > Adding nfs people to the CC list
> >
> > The reproducer uses NFSv3, but the bug report says NFSv4
> > at the top.
> >
> > I was able to reproduce this on my nfsd-5.15.y branch
> > with NFSv3.
> >
> > A bisect would be most helpful.
> >
>
> I was able to bisect; here are the results:
>
>
>
> 2267b2e84593bd3d61a1188e68fba06307fa9dab is the first bad commit
> commit 2267b2e84593bd3d61a1188e68fba06307fa9dab
> Author: Alexander Aring <aahringo@xxxxxxxxxx>
> Date: Tue Sep 12 17:53:18 2023 -0400
>
> lockd: introduce safe async lock op
>
> [ Upstream commit 2dd10de8e6bcbacf85ad758b904543c294820c63 ]
>
> This patch reverts mostly commit 40595cdc93ed ("nfs: block notification
> on fs with its own ->lock") and introduces an EXPORT_OP_ASYNC_LOCK
> export flag to signal that the "own ->lock" implementation supports
> async lock requests. The only main user is DLM that is used by GFS2 and
> OCFS2 filesystem. Those implement their own lock() implementation and
> return FILE_LOCK_DEFERRED as return value. Since commit 40595cdc93ed
> ("nfs: block notification on fs with its own ->lock") the DLM
> implementation were never updated. This patch should prepare for DLM
> to set the EXPORT_OP_ASYNC_LOCK export flag and update the DLM
> plock implementation regarding to it.
>
> Acked-by: Jeff Layton <jlayton@xxxxxxxxxx>
> Signed-off-by: Alexander Aring <aahringo@xxxxxxxxxx>
> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>
> Documentation/filesystems/nfs/exporting.rst | 7 +++++++
> fs/lockd/svclock.c | 4 +---
> fs/nfsd/nfs4state.c | 10 +++++++---
> include/linux/exportfs.h | 14 ++++++++++++++
> 4 files changed, 29 insertions(+), 6 deletions(-)
>
> Bisect log:
> ==========
>
> git bisect start
> # status: waiting for both good and bad commits
> # bad: [cdfd0a7f01396303e9d4fb3513a1127636f12e5e] Linux 5.15.154
> git bisect bad cdfd0a7f01396303e9d4fb3513a1127636f12e5e
> # status: waiting for good commit(s), bad commit known
> # good: [9465fef4ae351749f7068da8c78af4ca27e61928] Linux 5.15.153
> git bisect good 9465fef4ae351749f7068da8c78af4ca27e61928
> # good: [4420d19ed4e4fe2adc9bed8a49bf195db1137458] NFSD: Report average age of filecache items
> git bisect good 4420d19ed4e4fe2adc9bed8a49bf195db1137458
> # good: [94e412c945e64579798204aee7bc669d0acfaf79] nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open
> git bisect good 94e412c945e64579798204aee7bc669d0acfaf79
> # bad: [254f1c2521716cafc63530750ce313059f5d5979] iwlwifi: mvm: rfi: use kmemdup() to replace kzalloc + memcpy
> git bisect bad 254f1c2521716cafc63530750ce313059f5d5979
> # bad: [e635f652696ef6f1230621cfd89c350cb5ec6169] serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO
> git bisect bad e635f652696ef6f1230621cfd89c350cb5ec6169
> # good: [05b452e8748bcf92c00725691437e16d46af7c28] nfsd: Fix creation time serialization order
> git bisect good 05b452e8748bcf92c00725691437e16d46af7c28
> # bad: [ccd9fe71b9ee46ebcecec8aec5c4f1e1ddd35dfd] nfsd: Fix a regression in nfsd_setattr()
> git bisect bad ccd9fe71b9ee46ebcecec8aec5c4f1e1ddd35dfd
> # bad: [2267b2e84593bd3d61a1188e68fba06307fa9dab] lockd: introduce safe async lock op
> git bisect bad 2267b2e84593bd3d61a1188e68fba06307fa9dab
> # good: [56e5eeff6cfa4bd6ffa2b2ae5b8bfc1c28044faf] nfsd: separate nfsd_last_thread() from nfsd_put()
> git bisect good 56e5eeff6cfa4bd6ffa2b2ae5b8bfc1c28044faf
> # good: [6e5fed48d8b7b25f8517a1292b62a3a86a5aec91] NFSD: fix possible oops when nfsd/pool_stats is closed.
> git bisect good 6e5fed48d8b7b25f8517a1292b62a3a86a5aec91
> # first bad commit: [2267b2e84593bd3d61a1188e68fba06307fa9dab] lockd: introduce safe async lock op
>
>
> Hope the above might help.

Nice work. Thanks!


> I did not test the revert of the culprit commit on top of 5.15.154 yet.

Please try reverting that one -- it's very close to the top so one
or two others might need to be pulled off as well.
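
A rough sketch of that revert test, not a verified procedure. The two
hashes come from the bisect log quoted above; the branch name and the
DRY_RUN wrapper are made up for illustration. By default it only prints
the commands; set DRY_RUN=0 inside a linux-stable checkout to run them.

```shell
#!/bin/sh
# Sketch: revert the bisected commit on top of 5.15.154.
culprit=2267b2e84593bd3d61a1188e68fba06307fa9dab   # lockd: introduce safe async lock op
base=cdfd0a7f01396303e9d4fb3513a1127636f12e5e      # Linux 5.15.154
branch=revert-async-lock                           # hypothetical branch name

# Hypothetical helper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run git checkout -b "$branch" "$base"
# List later commits touching the same files; these are the candidates
# that may have to be reverted first if the revert below conflicts.
run git log --oneline "$culprit..$base" -- fs/lockd fs/nfsd include/linux/exportfs.h
run git revert --no-edit "$culprit"
```

If the revert applies cleanly, rebuild, boot, and rerun the fcntl17
reproducer quoted above to see whether the failure goes away.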

I expect this is due to a missing prerequisite commit.


--
Chuck Lever