Re: CVE-2024-50106: nfsd: fix race between laundromat and free_stateid

From: Chuck Lever
Date: Tue Dec 17 2024 - 20:56:01 EST


On 12/17/24 10:59 AM, Greg Kroah-Hartman wrote:
On Tue, Dec 17, 2024 at 11:30:41PM +0800, Li Lingfeng wrote:
Hi,
After analysis, we think that this issue was introduced not by commit
2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by
clp->cl_lock") but by commit 83e733161fde ("nfsd: avoid race after
unhash_delegation_locked()").
Therefore, kernel versions earlier than 6.9 are not affected by this issue.

// normal case 1 -- free deleg by delegreturn
1) OP_DELEGRETURN
nfsd4_delegreturn
 nfsd4_lookup_stateid
 destroy_delegation
  destroy_unhashed_deleg
   nfs4_unlock_deleg_lease
    vfs_setlease // unlock
 nfs4_put_stid // put last refcount
  idr_remove // remove from cl_stateids
  s->sc_free // free deleg

2) OP_FREE_STATEID
nfsd4_free_stateid
 find_stateid_locked // can not find the deleg in cl_stateids


// normal case 2 -- free deleg by laundromat
nfs4_laundromat
 state_expired
 unhash_delegation_locked // set NFS4_REVOKED_DELEG_STID
 list_add // add the deleg to reaplist
 list_first_entry // get the deleg from reaplist
 revoke_delegation
  destroy_unhashed_deleg
   nfs4_unlock_deleg_lease
   nfs4_put_stid


// abnormal case
nfs4_laundromat
 state_expired
 unhash_delegation_locked
  // set NFS4_REVOKED_DELEG_STID
 list_add
  // add the deleg to reaplist
                                1) OP_DELEGRETURN
                                nfsd4_delegreturn
                                 nfsd4_lookup_stateid
                                  nfsd4_stid_check_stateid_generation
                                  nfsd4_verify_open_stid
                                   // check NFS4_REVOKED_DELEG_STID
                                   // and return nfserr_deleg_revoked
                                 // skip destroy_delegation

                                2) OP_FREE_STATEID
                                nfsd4_free_stateid
                                 // check NFS4_REVOKED_DELEG_STID
                                 list_del_init
                                  // remove deleg from reaplist
                                 nfs4_put_stid
                                  // free deleg
 list_first_entry
  // cannot get the deleg from reaplist


Before commit 83e733161fde ("nfsd: avoid race after
unhash_delegation_locked()"), nfs4_laundromat --> unhash_delegation_locked
did not set NFS4_REVOKED_DELEG_STID for the deleg.
So the description "it marks the delegation stid revoked" in the CVE fix
patch does not hold for those kernels, and the OP_FREE_STATEID operation
will not release the deleg.
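
The interleaving in the abnormal case can be modelled as a small
single-threaded userspace sketch. This is not kernel code: every toy_*
name below is hypothetical, the "revoked" field only stands in for
NFS4_REVOKED_DELEG_STID, and the mark_revoked_early parameter is an
assumption used to contrast the behaviour described above for kernels
with and without commit 83e733161fde.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a delegation stateid. */
struct toy_deleg {
	bool revoked;		/* models NFS4_REVOKED_DELEG_STID */
	bool on_reaplist;	/* models membership of the laundromat's reaplist */
	int refcount;
	bool freed;
};

/*
 * First half of the laundromat: unhash an expired delegation and queue it.
 * mark_revoked_early == true models the behaviour described for kernels
 * with 83e733161fde; false models the earlier behaviour.
 */
static void toy_laundromat_unhash(struct toy_deleg *dp, bool mark_revoked_early)
{
	if (mark_revoked_early)
		dp->revoked = true;
	dp->on_reaplist = true;
}

/* Drop one reference; the object counts as freed when the last one goes. */
static void toy_put(struct toy_deleg *dp)
{
	assert(dp->refcount > 0);
	if (--dp->refcount == 0)
		dp->freed = true;
}

/*
 * OP_FREE_STATEID: acts only on a delegation already marked revoked.
 * Returns true if it unlinked and released the delegation.
 */
static bool toy_free_stateid(struct toy_deleg *dp)
{
	if (!dp->revoked)
		return false;
	dp->on_reaplist = false;	/* models list_del_init() */
	toy_put(dp);
	return true;
}

/* Second half of the laundromat: is the entry still there to be reaped? */
static bool toy_laundromat_can_reap(const struct toy_deleg *dp)
{
	return dp->on_reaplist && !dp->freed;
}

/*
 * Replay the abnormal interleaving: unhash, then FREE_STATEID, then reap.
 * Returns true if the laundromat lost its entry in between.
 */
bool toy_race_loses_entry(bool mark_revoked_early)
{
	struct toy_deleg dp = { .refcount = 1 };

	toy_laundromat_unhash(&dp, mark_revoked_early);
	toy_free_stateid(&dp);
	return !toy_laundromat_can_reap(&dp);
}
```

In this model, toy_race_loses_entry(true) reports that the entry was
taken away from the laundromat, while toy_race_loses_entry(false) leaves
it in place because the FREE_STATEID step never sees the revoked mark.
The real code paths involve locking and refcounting that the sketch
ignores.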

Thanks for the research. If the maintainers involved agree, we'll be
glad to add a .vulnerable file to our git repo and regenerate the json
entry to reflect this starting point for the issue.

Hi Greg,

As mentioned earlier, our reviewers felt that this bug would indeed be
difficult or impossible to reproduce before 83e733161fde, and there
have been no reports of similar crash symptoms in kernels before v6.9.

No objection to updating the CVE to reflect that.


--
Chuck Lever