Re: [PATCH 12/24] nfsd: add data structures for handling CB_NOTIFY
From: Chuck Lever
Date: Wed Apr 08 2026 - 14:41:39 EST
On Tue, Apr 7, 2026, at 9:21 AM, Jeff Layton wrote:
> Add the data structures, allocation helpers, and callback operations
> needed for directory delegation CB_NOTIFY support:
>
> - struct nfsd_notify_event: carries fsnotify events for CB_NOTIFY
> - struct nfsd4_cb_notify: per-delegation state for notification handling
> - Union dl_cb_fattr with dl_cb_notify in nfs4_delegation since a
> delegation is either a regular file delegation or a directory
> delegation, never both
>
> Refactor alloc_init_deleg() into a common __alloc_init_deleg() base
> with a pluggable sc_free callback, and add alloc_init_dir_deleg() which
> allocates the page array and notify4 buffer needed for CB_NOTIFY
> encoding.
>
> Add skeleton nfsd4_cb_notify_ops with done/release handlers that will
> be filled in when the notification path is wired up.
>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 4afe7e68fb51..b2b8c454fc0f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3381,6 +3440,30 @@ nfsd4_cb_getattr_release(struct nfsd4_callback
> *cb)
> nfs4_put_stid(&dp->dl_stid);
> }
>
> +static int
> +nfsd4_cb_notify_done(struct nfsd4_callback *cb,
> + struct rpc_task *task)
> +{
> + switch (task->tk_status) {
> + case -NFS4ERR_DELAY:
> + rpc_delay(task, 2 * HZ);
> + return 0;
> + default:
> + return 1;
> + }
> +}
> +
> +static void
> +nfsd4_cb_notify_release(struct nfsd4_callback *cb)
> +{
> + struct nfsd4_cb_notify *ncn =
> + container_of(cb, struct nfsd4_cb_notify, ncn_cb);
> + struct nfs4_delegation *dp =
> + container_of(ncn, struct nfs4_delegation, dl_cb_notify);
> +
> + nfs4_put_stid(&dp->dl_stid);
> +}
> +
> static const struct nfsd4_callback_ops nfsd4_cb_recall_any_ops = {
> .done = nfsd4_cb_recall_any_done,
> .release = nfsd4_cb_recall_any_release,
So when a client responds with NFS4ERR_DELAY, the RPC framework retries
after 2s. On retry, prepare() is called again, but ncn_evt_cnt is
already 0 (drained in the first prepare). prepare returns false, which
destroys the callback.
Events arriving during the retry window are dropped because
nfsd4_run_cb_notify() returns early when NFSD4_CALLBACK_RUNNING is set.
After the callback is destroyed, future events can queue a new CB_NOTIFY,
but the window's events are lost.
The result is that the client misses notifications. Does this impact
behavioral correctness or spec compliance? Is there a way for that
client to detect the loss and recover?
--
Chuck Lever