Re: [PATCH 0/6] SUNRPC: Address remaining cache_check_rcu() UAF in cache content files
From: Chuck Lever
Date: Sun May 10 2026 - 12:19:27 EST
Hi Erkun -
On Sat, May 9, 2026, at 5:41 AM, yangerkun wrote:
> Hmm... /proc/net is always a symlink to /proc/self/net. After opening
> /proc/net/rpc/<cache>/content and attempting to read it, the
> proc_reg_read function calls use_pde before pde_read. This sequence can
> prevent a race condition because nfsd_export_shutdown leads to
> cache_unregister_net, which calls remove_cache_proc_entries, then
> proc_remove, and eventually proc_entry_rundown. The proc_entry_rundown
> function waits until unuse_pde is called in proc_reg_read. Therefore,
> I'm not sure if forgetting to call get_net when opening
> /proc/net/rpc/<cache>/content is the root cause of the null pointer in
> c_show.
I walked through the synchronization. You're right.
cache_unregister_net() calls remove_cache_proc_entries(),
which runs proc_remove(). That calls remove_proc_subtree(),
which invokes proc_entry_rundown() on each per-cache file.
Rundown does
atomic_add_return(BIAS, &de->in_use), where BIAS = -1U << 31.
No active readers means the post-add value equals BIAS and
rundown returns at once. Readers present means the value
exceeds BIAS, and wait_for_completion() blocks until the last
unuse_pde() decrements the counter to exactly BIAS and signals
the completion. Once BIAS has been added the counter is negative,
so atomic_inc_unless_negative() in use_pde() fails and any later
read() on a still-open userspace fd returns -EIO without touching
cd. close_pdeo() forces release
on the remaining openers while cd is still valid.
cache_destroy_net() runs only after that whole sequence has
finished, so cd->hash_table is freed once no reader can be
inside cache_seq_*_rcu() and no fd can dereference cd through
a release callback.
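To make the counter arithmetic concrete, here is a minimal
userspace sketch of the in_use protocol described above. It is a
model built on C11 atomics, not the kernel code: every model_*
name is hypothetical, the bool "completed" stands in for the
struct completion, and model_rundown_begin() only reports whether
rundown would have to wait rather than actually blocking.

```c
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; INT_MIN has the same bit pattern as -1U << 31. */
#define MODEL_BIAS INT_MIN

struct model_pde {
	atomic_int in_use;	/* reader count, biased negative once rundown starts */
	bool completed;		/* stand-in for the kernel's struct completion */
};

/* Models atomic_inc_unless_negative(): take a reference unless rundown began. */
static bool model_use_pde(struct model_pde *pde)
{
	int v = atomic_load(&pde->in_use);

	while (v >= 0) {
		/* On failure, v is reloaded with the current value and rechecked. */
		if (atomic_compare_exchange_weak(&pde->in_use, &v, v + 1))
			return true;
	}
	return false;
}

/* The last reader out after rundown brings the counter to exactly BIAS. */
static void model_unuse_pde(struct model_pde *pde)
{
	int old = atomic_fetch_sub(&pde->in_use, 1);

	if (old == MODEL_BIAS + 1)
		pde->completed = true;
}

/*
 * Add BIAS to the counter. The post-add value equals BIAS only when the
 * pre-add value was zero, so "old != 0" means readers are still inside
 * and rundown would block in wait_for_completion().
 */
static bool model_rundown_begin(struct model_pde *pde)
{
	int old = atomic_fetch_add(&pde->in_use, MODEL_BIAS);

	return old != 0;
}

/* Models the read path: fail with -EIO without touching cache state. */
static int model_read(struct model_pde *pde)
{
	if (!model_use_pde(pde))
		return -EIO;
	/* A real handler would dereference cd here. */
	model_unuse_pde(pde);
	return 0;
}
```

With one reader inside, model_rundown_begin() reports that it
must wait, model_read() fails with -EIO from that point on, and
the reader's model_unuse_pde() is what sets "completed".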
The 5/6 changelog overstates the window. Your reproducer
opens /proc/fs/nfs/exports through exports_nfsd_open(), which
bypasses use_pde() and is the path e7fcf179b82d closed. The
sunrpc cache files reach c_show through proc_reg_read(), which
goes through use_pde()/unuse_pde() and is covered by rundown.
5/6 doesn't close the hazard its changelog describes.
Patch 3/6 is what matches Misbah's reproducer. Pre-series
ip_map_put() calls auth_domain_put() synchronously, with only
the ip_map free deferred:
	auth_domain_put(&im->m_client->h);   /* synchronous */
	kfree_rcu(im, m_rcu);
A reader walking auth.unix.ip/content under rcu_read_lock()
can dereference im->m_client after the auth_domain has been
freed. Same shape as 48db892356d6's svc_export fix, applied
to ip_map. 3/6 moves auth_domain_put() into a deferred
ip_map_release() scheduled via queue_rcu_work(), so the
sub-object free waits for the grace period.
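The ordering difference can be sketched in userspace as follows.
This is a toy model under stated assumptions, not the sunrpc
code: the model_* names are hypothetical, "freed" flags replace
real frees so the state stays inspectable, and a single after_gp
callback stands in for both kfree_rcu() and queue_rcu_work().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an auth_domain with a refcount. */
struct model_domain {
	int refs;
	bool freed;
};

/* Hypothetical stand-in for an ip_map holding a domain reference. */
struct model_ip_map {
	struct model_domain *client;
	bool freed;
	/* Work to run once the (simulated) grace period has elapsed. */
	void (*after_gp)(struct model_ip_map *im);
};

static void model_domain_put(struct model_domain *d)
{
	if (--d->refs == 0)
		d->freed = true;
}

/* Pre-series shape: only the container free waits for the grace period. */
static void model_free_map(struct model_ip_map *im)
{
	im->freed = true;
}

static void model_put_pre_series(struct model_ip_map *im)
{
	model_domain_put(im->client);	/* synchronous: readers may still look */
	im->after_gp = model_free_map;
}

/* 3/6 shape: the domain put and the container free are both deferred. */
static void model_release_map(struct model_ip_map *im)
{
	model_domain_put(im->client);
	im->freed = true;
}

static void model_put_deferred(struct model_ip_map *im)
{
	im->after_gp = model_release_map;
}

/* Simulates the end of the grace period: no reader is inside anymore. */
static void model_grace_period_end(struct model_ip_map *im)
{
	if (im->after_gp)
		im->after_gp(im);
	im->after_gp = NULL;
}
```

In the pre-series shape the domain is already marked freed
before model_grace_period_end() runs, which is the window a
reader can hit. In the deferred shape neither object is freed
until the grace period ends.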
For v2: re-test Misbah's reproducer with patches 1-4 and 6
only and see whether 3/6 alone closes the crash. If it does,
drop 5/6; if it doesn't, reframe 5/6 as a consistency change
without the UAF claim (and without the behavioral change that
pins a netns alive while a debug fd is open). Either way, the
cover letter needs a rewrite to match.
Thanks for your analysis and review.
--
Chuck Lever