Re: [PATCH] NFSv4: clear exception state on successful mkdir retry
From: Anna Schumaker
Date: Wed Jun 10 2026 - 10:40:08 EST

Hi Thorsten,

On Tue, Jun 9, 2026, at 6:05 AM, Thorsten Leemhuis wrote:
> On 5/13/26 09:18, Thorsten Leemhuis wrote:
>> [top-posting to facilitate processing]
>>
>> @NFSv4 maintainers, just wondering, did this patch maybe fall through
>> the cracks? It fixes a regression, that's why it's on my radar. Or was
>> there some progress and I missed it?

The patch is in my linux-next branch here:
https://git.linux-nfs.org/?p=anna/linux-nfs.git;a=commit;h=238e9b51aa29f48b6243212a3b75c8e48d6b96fd

It'll be included when the merge window opens this weekend.

Anna

>
> Still no progress afaics. Feels like I'm missing something obvious or
> like I'm totally off track.
>
> Igor, Neil, is that the case? Or are you also waiting for the fix to
> make progress?
>
> Ciao, Thorsten
>
>> On 4/29/26 12:49, Igor Raits wrote:
>>> After a server returns NFS4ERR_DELAY for an NFSv4 CREATE issued by
>>> mkdir(2), the client correctly waits and retries. When the retry
>>> succeeds, however, mkdir(2) can still surface -EEXIST to userspace
>>> even though the directory was just created on the server.
>>>
>>> Reproducer (random 16-hex names so collisions are not the cause)
>>> against an in-kernel Linux nfsd; reproduces under both NFSv4.0 and
>>> NFSv4.2:
>>>
>>>     N=2000000; base=/var/gdc/export
>>>     for ((i=1; i<=N; i++)); do
>>>       d=$base/$(openssl rand -hex 8)
>>>       mkdir "$d" 2>/dev/null || echo "$(date +%T) failed loop=$i $d"
>>>       rmdir "$d" 2>/dev/null
>>>     done
>>>
>>> Failures cluster at the cadence at which the server-side auth/export
>>> cache refresh path causes nfsd to return NFS4ERR_DELAY for CREATE.
>>>
>>> A wire trace of one failure (the three CREATE RPCs all come from a
>>> single mkdir(2), generated by the do-while in nfs4_proc_mkdir()):
>>>
>>>     client -> server  CREATE name=...  -> NFS4ERR_DELAY
>>>         ~100 ms later
>>>     client -> server  CREATE name=...  -> NFS4_OK (dir created)
>>>         ~80 us later
>>>     client -> server  CREATE name=...  -> NFS4ERR_EXIST (correct)
>>>
>>> Since commit dd862da61e91 ("nfs: fix incorrect handling of large-number
>>> NFS errors in nfs4_do_mkdir()"), nfs4_handle_exception() is called only
>>> when _nfs4_proc_mkdir() returned an error. That gate breaks retry-state
>>> hygiene: nfs4_do_handle_exception() resets exception.{delay,recovering,
>>> retry} to 0 on entry, so calling it on success is what previously
>>> cleared the retry flag set by the preceding NFS4ERR_DELAY iteration.
>>> With the gate in place, exception.retry stays at 1 after the successful
>>> retry, the loop runs once more, and the resulting CREATE for an
>>> already-created name yields NFS4ERR_EXIST -> -EEXIST to userspace.
>>>
>>> Drop the conditional and call nfs4_handle_exception() unconditionally,
>>> matching every other do-while in fs/nfs/nfs4proc.c (nfs4_proc_symlink(),
>>> nfs4_proc_link(), etc.). The dentry/status separation introduced by
>>> that commit is preserved.
>>>
>>> Fixes: dd862da61e91 ("nfs: fix incorrect handling of large-number NFS errors in nfs4_do_mkdir()")
>>> Reported-and-tested-by: Jan Čípa <jan.cipa@xxxxxxxxxxxx>
>>> Closes: https://lore.kernel.org/linux-nfs/CA+9S74hSp_tJu2Ffe2BPNC2T25gfkhgjjDkdgSsF5c2rnJq_wA@xxxxxxxxxxxxxx/
>>> Reviewed-by: NeilBrown <neil@xxxxxxxxxx>
>>> Cc: stable@xxxxxxxxxxxxxxx
>>> Signed-off-by: Igor Raits <igor.raits@xxxxxxxxx>
>>> ---
>>> fs/nfs/nfs4proc.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index a0885ae55abc..ffd14141ea1d 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -5393,10 +5393,9 @@ static struct dentry *nfs4_proc_mkdir(struct inode *dir, struct dentry *dentry,
>>>  	do {
>>>  		alias = _nfs4_proc_mkdir(dir, dentry, sattr, label, &err);
>>>  		trace_nfs4_mkdir(dir, &dentry->d_name, err);
>>> +		err = nfs4_handle_exception(NFS_SERVER(dir), err, &exception);
>>>  		if (err)
>>> -			alias = ERR_PTR(nfs4_handle_exception(NFS_SERVER(dir),
>>> -							      err,
>>> -							      &exception));
>>> +			alias = ERR_PTR(err);
>>>  	} while (exception.retry);
>>> nfs4_label_release_security(label);
>>>
>>
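
For readers following the thread, the stale-retry behaviour described in
the commit message can be modelled in a few lines of userspace C. This is
only a toy sketch, not kernel code: every sim_* name is hypothetical, the
fake server returns a delay error a fixed number of times before creating
the directory, and sim_handle_exception() imitates just the one property
the commit message relies on (the retry flag is reset on entry and set
again only for a delay error). The error values mirror NFS4ERR_DELAY
(10008) and EEXIST (17) purely for readability.

```c
/* Toy model of the mkdir retry loop; all sim_* names are hypothetical. */
#define SIM_DELAY 10008   /* stands in for NFS4ERR_DELAY */
#define SIM_EXIST 17      /* stands in for NFS4ERR_EXIST / EEXIST */

struct sim_exception { int retry; };

struct sim_server {
	int delays_left;   /* how many CREATEs will still be answered with DELAY */
	int dir_exists;    /* set once a CREATE has succeeded */
	int creates_seen;  /* number of CREATE calls received */
};

/* Fake CREATE: delay first, then create once, then report "exists". */
static int sim_create(struct sim_server *srv)
{
	srv->creates_seen++;
	if (srv->delays_left > 0) {
		srv->delays_left--;
		return -SIM_DELAY;
	}
	if (srv->dir_exists)
		return -SIM_EXIST;
	srv->dir_exists = 1;
	return 0;
}

/* Assumed behaviour: reset retry on entry, request a retry only on DELAY. */
static int sim_handle_exception(int err, struct sim_exception *exc)
{
	exc->retry = 0;
	if (err == -SIM_DELAY) {
		exc->retry = 1;
		return 0;
	}
	return err;
}

/* Gated variant: the handler runs only when CREATE returned an error. */
static int sim_mkdir_gated(struct sim_server *srv)
{
	struct sim_exception exc = { 0 };
	int err, status;

	do {
		err = sim_create(srv);
		status = err;
		if (err)
			status = sim_handle_exception(err, &exc);
	} while (exc.retry);  /* retry is still 1 after a successful retry */
	return status;
}

/* Unconditional variant: the handler also runs on success, clearing retry. */
static int sim_mkdir_fixed(struct sim_server *srv)
{
	struct sim_exception exc = { 0 };
	int err;

	do {
		err = sim_create(srv);
		err = sim_handle_exception(err, &exc);
	} while (exc.retry);
	return err;
}

/* Run one simulated mkdir; report the status and the number of CREATEs. */
int sim_run(int gated, int delays, int *creates)
{
	struct sim_server srv = { delays, 0, 0 };
	int status = gated ? sim_mkdir_gated(&srv) : sim_mkdir_fixed(&srv);

	*creates = srv.creates_seen;
	return status;
}
```

In this model, sim_run(1, 1, &n) sends three CREATEs and returns
-SIM_EXIST, the same shape as the wire trace quoted above, while
sim_run(0, 1, &n) stops after two CREATEs and returns 0. With no delay
at all, both variants send a single CREATE and succeed.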