Re: [PATCH 1/1] nfs: return a write delegation when a SETATTR drops our write access

From: Trond Myklebust

Date: Fri May 29 2026 - 10:00:41 EST


On Thu, 2026-05-28 at 15:22 -0400, Benjamin Coddington wrote:
> A client holding an OPEN_DELEGATE_WRITE delegation can satisfy a
> later
> open(O_WRONLY) from the cached delegation (can_open_delegated())
> without
> sending an OPEN to the server. That cached "open for write" assertion
> is
> only valid while the delegation holder still has write access. A
> SETATTR
> that changes mode, owner, or group can revoke that access -- after
> which an
> open served from the delegation would bypass an access check the
> server
> would now fail, and, against a server that recalls the delegation on
> such a
> change, the SETATTR draws a CB_RECALL/NFS4ERR_DELAY/DELEGRETURN/retry
> round
> trip.
>
> Before issuing such a SETATTR, check whether the proposed
> mode/owner/group
> would remove write access for the delegation's owning credential,
> judged by
> the resulting POSIX mode bits. If so, return the delegation first:
> the
> return is synchronous and flushes modified data, so the SETATTR
> proceeds on
> an open or special stateid and the next open revalidates access with
> the
> server. Permission changes that keep the holder's write access leave
> the
> delegation in place.
>
> Only the mode bits and the holder's fsuid/fsgid are consulted. An
> NFSv4 ACL
> cannot be evaluated by the client, a privileged caller may retain
> access the
> bits deny, and supplementary group membership is not checked, so the
> test is
> necessarily approximate -- but an inexact answer costs at most an
> unnecessary delegation return or a fall back to the server's recall,
> never
> incorrect access.
>
> RFC 8881 Section 10.4.4 permits a client to return a delegation
> voluntarily,
> performing the same pre-return state updates (data flush, pending
> truncation, CLOSE/OPEN/LOCK) it would on a recall. Commit
> c01d36457dcc
> ("NFSv4: Don't return the delegation when not needed by NFSv4.x
> (x>0)")
> stopped returning write delegations on SETATTR for NFSv4.1+, since
> the
> server can identify the delegation holder from the SEQUENCE clientid
> and
> need not recall. That holds for changes that do not affect the
> holder's
> access; restore a return only for the narrow case where the holder's
> own
> write access is removed.

Hmmm... I'd argue that while recalling the delegation in this case is
mandatory for NFSv4.0, that is certainly not true for NFSv4.1.

Furthermore, I'd argue that if the holder of a write delegation is just
changing the mode, then that should never result in a delegation recall
for a well written NFSv4.1 server. The reason is this does not impact
the client's ability to cache data, metadata or lock state. It only
impacts its ability to rely on previously cached access data when
handling new opens.

One can argue whether or not it's needed for a uid or gid change by
said holder of the delegation, but there too I'd say the right
behaviour is to err on the side of not recalling.
The exception might be if this is an attribute delegation, and the
result will be that the cred attached to the delegation will no longer
be able to issue a SETATTR to update the atime/mtime on delegation
return.

So yes to pre-emptive invalidation of the access cache, but I'm very
sceptical to actually pre-emptively returning the delegation or even
the layouts.

>
> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxxxxxxx>
> ---
>  fs/nfs/nfs4proc.c | 66 ++++++++++++++++++++++++++++++++++++++++++++-
> --
>  1 file changed, 62 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index a9b8d482d289..e4b7322bf75c 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -4506,7 +4506,55 @@ int nfs4_proc_getattr(struct nfs_server
> *server, struct nfs_fh *fhandle,
>   return err;
>  }
>  
> -/*
> +/*
> + * Would applying @sattr (which changes mode, owner, and/or group)
> remove the
> + * write access of a held write delegation's owning credential, as
> judged by
> + * the resulting file mode bits?
> + *
> + * Such a change makes the delegation's cached "open for write"
> assertion
> + * stale: a later open(O_WRONLY) could be served from the delegation
> without
> + * the server getting a chance to deny it.  Only the mode bits and
> the
> + * holder's fsuid/fsgid are consulted; an NFSv4 ACL (which the
> client cannot
> + * evaluate locally), a privileged caller, or supplementary group
> membership
> + * may make the answer imprecise, but the cost is at most an
> unnecessary
> + * delegation return or a fall back to the server's recall -- never
> incorrect
> + * access.
> + */
> +static bool nfs4_setattr_removes_write(struct inode *inode, struct
> iattr *sattr)
> +{
> + struct nfs_delegation *delegation;
> + const struct cred *cred;
> + umode_t mode = inode->i_mode;
> + kuid_t uid = inode->i_uid;
> + kgid_t gid = inode->i_gid;
> + bool ret = false;
> +
> + delegation = nfs4_get_valid_delegation(inode);
> + if (!delegation)
> + return false;
> + if (!(delegation->type & FMODE_WRITE))
> + goto out;
> + cred = delegation->cred;
> +
> + if (sattr->ia_valid & ATTR_MODE)
> + mode = sattr->ia_mode;
> + if (sattr->ia_valid & ATTR_UID)
> + uid = sattr->ia_uid;
> + if (sattr->ia_valid & ATTR_GID)
> + gid = sattr->ia_gid;
> +
> + if (uid_eq(uid, cred->fsuid))
> + ret = !(mode & S_IWUSR);
> + else if (gid_eq(gid, cred->fsgid))
> + ret = !(mode & S_IWGRP);
> + else
> + ret = !(mode & S_IWOTH);
> +out:
> + nfs_put_delegation(delegation);
> + return ret;
> +}
> +
> +/*
>   * The file is not closed if it is opened due to the a request to
> change
>   * the size of the file. The open call will not be needed once the
>   * VFS layer lookup-intents are implemented.
> @@ -4555,9 +4603,19 @@ nfs4_proc_setattr(struct dentry *dentry,
> struct nfs_fattr *fattr,
>   cred = ctx->cred;
>   }
>  
> - /* Return any delegations if we're going to change ACLs */
> - if ((sattr->ia_valid & (ATTR_MODE|ATTR_UID|ATTR_GID)) != 0)
> - nfs4_inode_make_writeable(inode);
> + /*
> + * A change to mode, owner, or group that removes the write
> + * delegation holder's own write access makes the
> delegation's cached
> + * "open for write" stale; return it so a later open()
> revalidates
> + * access with the server.  A change that keeps write access
> leaves
> + * the delegation in place.
> + */
> + if (sattr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) {
> + if (nfs4_setattr_removes_write(inode, sattr))
> + nfs4_inode_return_delegation(inode);
> + else
> + nfs4_inode_make_writeable(inode);
> + }
>  
>   status = nfs4_do_setattr(inode, cred, fattr, sattr, ctx,
> NULL);
>   if (status == 0) {

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@xxxxxxxxxx, trond.myklebust@xxxxxxxxxxxxxxx