Re: [Patch 0/2] NFSD: Fix server hang when there are multiple layout conflicts
From: Dai Ngo
Date: Tue Nov 11 2025 - 10:45:29 EST
Hi Ben,
On 11/9/25 10:34 AM, Benjamin Coddington wrote:
On 6 Nov 2025, at 12:05, Dai Ngo wrote:
Hey Dai,
When a layout conflict triggers a call to __break_lease, the function
nfsd4_layout_lm_break clears the fl_break_time timeout before sending
the CB_LAYOUTRECALL. As a result, __break_lease repeatedly restarts
its loop, waiting indefinitely for the conflicting file lease to be
released.
If the number of lease conflicts matches the number of NFSD threads (which
defaults to 8), all available NFSD threads become occupied. Consequently,
there are no threads left to handle incoming requests or callback replies,
leading to a total hang of the NFS server.
This issue is reliably reproducible by running the Git test suite on a
configuration using SCSI layout.
This patchset fixes this problem by introducing the new lm_breaker_timedout
operation to lease_manager_operations and using a timeout for the layout
lease break.
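To make the difference between a cleared and a live break time concrete, here is a toy model of a wait loop. It is only a sketch under assumptions: it is not the kernel's __break_lease, the function wait_for_lease_release and its parameters are hypothetical, and the on_timeout callback merely stands in for the idea of a timed-out hook (the exact semantics of lm_breaker_timedout are defined by the patches, not by this model).

```python
def wait_for_lease_release(lease_held, break_time, now=0, tick=1,
                           max_ticks=100, on_timeout=None):
    """Toy model of a lease-break wait loop.

    break_time == 0 models a cleared break time (no deadline).
    Returns (outcome, ticks_waited).
    """
    ticks = 0
    while lease_held(now):
        if break_time and now >= break_time:
            # A deadline exists and has passed: notify and stop waiting.
            if on_timeout is not None:
                on_timeout()
            return "timed_out", ticks
        if ticks >= max_ticks:
            # Demo-only bound; without a deadline the loop would keep going.
            return "still_waiting", ticks
        now += tick
        ticks += 1
    return "released", ticks


if __name__ == "__main__":
    never_released = lambda now: True
    print(wait_for_lease_release(never_released, break_time=0))
    print(wait_for_lease_release(never_released, break_time=5))
```

In this model a lease that is never released keeps the waiter spinning until the demo bound when break_time is 0, while a non-zero break_time ends the wait once the simulated clock reaches it.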
I like your solution here, but I worry it can cause unexpected or
unnecessary client fencing when the problem is server-side (not enough
threads). Clients might be dutifully sending LAYOUTRETURN, but the server
can't service them
I agree. This is a server problem, yet we penalize the client. We need
a long-term solution for the resource shortage (server threads)
problem.
Fortunately, the client can detect reservation conflict errors and appears
to retry the I/O. Also, the client will ask for a new layout and, in the
process, re-register its reservation key, so I/O will continue.
- and this change will cause some potentially unexpected
fencing in environments where things could be fixed (by adding more knfsd
threads).
Also, I think we significantly bumped default thread counts
recently in nfs-utils:
eb5abb5c60ab (tag: nfs-utils-2-8-2-rc3) nfsd: dump default number of threads to 16
This helps a bit, but there is always a chance of a load that requires
more than the number of server threads.
You probably have already seen previous discussions about this:
https://lore.kernel.org/linux-nfs/1CC82EC5-6120-4EE4-A7F0-019CF7BC762C@xxxxxxxxxx/
This also changes the behavior for all layouts. I haven't thought through
the implications of that, but I wish we could have a knob for this behavior,
or perhaps a knfsd-specific fl_break_time tunable.
There is already a knob to tune the fl_break_time:
# cat /proc/sys/fs/lease-break-time
but currently lease-break-time is in seconds, so the minimum we can set
is 1, which I think is still too long to tie up a server thread.
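For scripting around this knob, a minimal sketch of reading it as a whole number of seconds. The helper name read_lease_break_time is made up for illustration, and the path parameter exists only so the function can be pointed at a different file when testing.

```python
from pathlib import Path

# The sysctl file mentioned above; the value is an integer number of seconds.
LEASE_BREAK_TIME = Path("/proc/sys/fs/lease-break-time")


def read_lease_break_time(path=LEASE_BREAK_TIME):
    """Return the configured lease break time in whole seconds."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    print("lease-break-time:", read_lease_break_time(), "second(s)")
```

Because the value is parsed as an integer, there is no way to express a sub-second break time through this interface.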
Last thought (for now): I think Neil has some work for dynamic knfsd thread
count.. or Jeff? (I am having trouble finding it) Would that work around
this problem?
This would help, and I prefer this route to reworking __break_lease
to return EAGAIN/jukebox while the server is recalling the layout.
Thank you for your feedback,
-Dai
Regards,
Ben