[BUG] sunrpc: cache_seq_start_rcu() slab-out-of-bounds (unpriv-userns reachable)
From: Farhad Alemi
Date: Thu May 28 2026 - 13:02:04 EST
Hello Chuck and the linux-nfs team,
I am reporting a sunrpc cache slab-out-of-bounds read found by syzkaller.
I am flagging this up front because the read site is reachable from an
unprivileged user via unshare(CLONE_NEWUSER|CLONE_NEWNET) on distros that
enable unprivileged user-namespace creation by default.
Summary:
A read(2) on any sunrpc-cache seq_file (e.g. the nfsd virtual
filesystem's /exports, or /proc/net/rpc/<cache>/content) drives
cache_seq_start_rcu() -> __cache_seq_start() at net/sunrpc/cache.c:1351.
After the iterator walks all hash buckets, __cache_seq_start writes back
*pos = ((long long)cd->hash_size << 32) + 1 (the literal final-line
assignment in the function). The next read re-enters __cache_seq_start
with that *pos, decodes hash = cd->hash_size, and the very first
hlist_for_each_entry_rcu() walks &cd->hash_table[hash] one element past
the array end. The existing "while (!ch && ++hash < cd->hash_size)"
check only guards the later bucket advances within the function, not the
initial hlist_for_each_entry_rcu() at entry.
The KASAN-reported 2K allocation is cd->hash_table allocated by
kzalloc_objs(struct hlist_head, cd->hash_size) in cache_create_net() at
net/sunrpc/cache.c:1733; struct cache_detail itself is a separate,
smaller kmemdup at line 1729. The 8-byte OOB read at the right edge of
the 2048-byte region reads hash_table[cd->hash_size] -- one struct
hlist_head past the end of the table.
Observed on:
- Linux v7.1-rc3-200-g70eda68668d1-dirty, x86_64, QEMU Q35
- KASAN enabled; panic_on_warn set
- The only local dirty file in my tree is drivers/tty/serial/serial_core.c,
containing a local ttyS0 console guard for the fuzzing harness. It is
unrelated to net/sunrpc/.
- __cache_seq_start() at net/sunrpc/cache.c:1351 still walks
&cd->hash_table[hash] without bounding the decoded hash against
cd->hash_size before the first iteration; bug remains reachable on
current mainline.
Impact:
A user with sufficient privilege (CAP_SYS_ADMIN) to mount(2) the nfsd
virtual filesystem can trigger a slab-out-of-bounds read by issuing a
sequence of read(2) calls on /<mountpoint>/exports (or, by symmetry, on
/proc/net/rpc/<cache>/content for any sunrpc cache populated in the
netns). The seq_file iterator exhausts the cache, *pos is rewritten to
encode hash = cd->hash_size, and the next read goes out of bounds:
  BUG: KASAN: slab-out-of-bounds in __cache_seq_start
      net/sunrpc/cache.c:1351 [inline]
  BUG: KASAN: slab-out-of-bounds in cache_seq_start_rcu+0x18d/0x3a0
      net/sunrpc/cache.c:1399
  Read of size 8 at addr ffff88811ac34800 by task syz.2.17/3610
  The buggy address is located 0 bytes to the right of
      allocated 2048-byte region [ffff88811ac34000, ffff88811ac34800)
Allocator (cache_detail->hash_table birth):
  __kmalloc_noprof+0x361/0x760 mm/slub.c:5307
  kzalloc_noprof include/linux/slab.h:1188 [inline]
  cache_create_net+0x9d/0x230 net/sunrpc/cache.c:1733
  nfsd_export_init+0x5e/0x1f0 fs/nfsd/export.c:1536
  nfsd_net_init+0x55/0x4b0 fs/nfsd/nfsctl.c:2209
  ops_init+0x361/0x5d0 net/core/net_namespace.c:137
  setup_net+0x11d/0x350 net/core/net_namespace.c:446
  copy_net_ns+0x3e7/0x570 net/core/net_namespace.c:579
  create_new_namespaces+0x3ec/0x6a0 kernel/nsproxy.c:132
  unshare_nsproxy_namespaces+0x14e/0x1a0 kernel/nsproxy.c:234
  ksys_unshare+0x582/0x9f0 kernel/fork.c:3243
  __x64_sys_unshare+0x3d/0x50 kernel/fork.c:3315
Crash path (read on /proc/net/rpc/<cache>/content):
  __cache_seq_start net/sunrpc/cache.c:1351 [inline]
  cache_seq_start_rcu+0x18d/0x3a0 net/sunrpc/cache.c:1399
  seq_read_iter+0x3fc/0xe20 fs/seq_file.c:226
  seq_read+0x36c/0x490 fs/seq_file.c:163
  vfs_read+0x211/0xa70 fs/read_write.c:572
  ksys_read+0x155/0x270 fs/read_write.c:717
Expected behavior:
__cache_seq_start() should bound the decoded "hash" index against
cd->hash_size before the first hlist_for_each_entry_rcu(). Today the
function does `hash = n >> 32` then directly indexes &cd->hash_table[hash]
with no bounds check; clamping or rejecting hash >= cd->hash_size at the
entry of the function fixes the OOB.
Threat model:
On distros with default user.max_user_namespaces > 0, an unprivileged
user can unshare(CLONE_NEWUSER|CLONE_NEWNET) to enter a user namespace
in which they hold CAP_SYS_ADMIN over the new netns. nfsd_net_init() runs
for the new netns and creates the sunrpc caches; the same mount(2) of
"nfsd" + read sequence then reaches the OOB without real CAP_SYS_ADMIN
in the initial namespace.
The attached reproducer takes the simpler initial-namespace path: it
mounts the nfsd virtual filesystem as root and issues enough reads on
/<mountpoint>/exports (lseek + readv + read in the syz-generated C) for
the seq_file iterator to advance *pos past the last hash bucket, after
which the next read trips the OOB. The reproducer does not explicitly
invoke unshare(). I am happy to build a CLONE_NEWUSER+CLONE_NEWNET-only
variant if that demonstration would help.
Reproducer:
I attached the generated C reproducer as reproducer.c. I also attached the
syzkaller program as reproducer.syz and the console report as
crash-report.txt.
Novelty check:
I searched syzbot dashboard data across upstream, fixed, invalid, stable,
and Android namespaces, and searched lore.kernel.org for
"cache_seq_start_rcu", "__cache_seq_start", and the broader
"slab-out-of-bounds" + "sunrpc". I did not find an exact match.
CVE-2023-52623 (sunrpc cache suspicious RCU usage) is a different bug.
I appreciate your time and consideration, and I'm grateful for your
work on this subsystem.
Regards,
Farhad
==================================================================
BUG: KASAN: slab-out-of-bounds in __cache_seq_start net/sunrpc/cache.c:1351 [inline]
BUG: KASAN: slab-out-of-bounds in cache_seq_start_rcu+0x18d/0x3a0 net/sunrpc/cache.c:1399
Read of size 8 at addr ffff88811ac34800 by task syz.2.17/3610
CPU: 1 UID: 0 PID: 3610 Comm: syz.2.17 Not tainted 7.1.0-rc3-00200-g70eda68668d1-dirty #1 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_address_description+0x55/0x1e0 mm/kasan/report.c:378
print_report+0x58/0x70 mm/kasan/report.c:482
kasan_report+0x117/0x150 mm/kasan/report.c:595
__cache_seq_start net/sunrpc/cache.c:1351 [inline]
cache_seq_start_rcu+0x18d/0x3a0 net/sunrpc/cache.c:1399
seq_read_iter+0x3fc/0xe20 fs/seq_file.c:226
seq_read+0x36c/0x490 fs/seq_file.c:163
vfs_read+0x211/0xa70 fs/read_write.c:572
ksys_read+0x155/0x270 fs/read_write.c:717
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f892199778d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe94887328 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007f8921c25fa0 RCX: 00007f892199778d
RDX: 0000000000002020 RSI: 0000200000003f00 RDI: 0000000000000003
RBP: 00007f8921a3eb3d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f8921c25fa0 R14: 00007f8921c25fa0 R15: 00000000000014ff
</TASK>
Allocated by task 3244:
kasan_save_stack mm/kasan/common.c:57 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
kasan_kmalloc include/linux/kasan.h:263 [inline]
__do_kmalloc_node mm/slub.c:5295 [inline]
__kmalloc_noprof+0x361/0x760 mm/slub.c:5307
kmalloc_noprof include/linux/slab.h:954 [inline]
kzalloc_noprof include/linux/slab.h:1188 [inline]
cache_create_net+0x9d/0x230 net/sunrpc/cache.c:1733
nfsd_export_init+0x5e/0x1f0 fs/nfsd/export.c:1536
nfsd_net_init+0x55/0x4b0 fs/nfsd/nfsctl.c:2209
ops_init+0x361/0x5d0 net/core/net_namespace.c:137
setup_net+0x11d/0x350 net/core/net_namespace.c:446
copy_net_ns+0x3e7/0x570 net/core/net_namespace.c:579
create_new_namespaces+0x3ec/0x6a0 kernel/nsproxy.c:132
unshare_nsproxy_namespaces+0x14e/0x1a0 kernel/nsproxy.c:234
ksys_unshare+0x582/0x9f0 kernel/fork.c:3243
__do_sys_unshare kernel/fork.c:3317 [inline]
__se_sys_unshare kernel/fork.c:3315 [inline]
__x64_sys_unshare+0x3d/0x50 kernel/fork.c:3315
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88811ac34000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 0 bytes to the right of
allocated 2048-byte region [ffff88811ac34000, ffff88811ac34800)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11ac30
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x200000000000040(head|node=0|zone=2)
page_type: f5(slab)
raw: 0200000000000040 ffff888100042000 dead000000000100 dead000000000122
raw: 0000000000000000 0000000800080008 00000000f5000000 0000000000000000
head: 0200000000000040 ffff888100042000 dead000000000100 dead000000000122
head: 0000000000000000 0000000800080008 00000000f5000000 0000000000000000
head: 0200000000000003 fffffffffffffe01 00000000ffffffff 00000000ffffffff
head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000008
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 3244, tgid 3244 (syz-executor), ts 64162282694, free_ts 64057945180
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x231/0x280 mm/page_alloc.c:1858
prep_new_page mm/page_alloc.c:1866 [inline]
get_page_from_freelist+0x20c2/0x21c0 mm/page_alloc.c:3946
__alloc_frozen_pages_noprof+0x192/0x380 mm/page_alloc.c:5226
alloc_slab_page mm/slub.c:3278 [inline]
allocate_slab+0x7c/0x660 mm/slub.c:3467
new_slab mm/slub.c:3525 [inline]
refill_objects+0x33e/0x3d0 mm/slub.c:7255
refill_sheaf mm/slub.c:2816 [inline]
__pcs_replace_empty_main+0x326/0x730 mm/slub.c:4651
alloc_from_pcs mm/slub.c:4749 [inline]
slab_alloc_node mm/slub.c:4883 [inline]
__do_kmalloc_node mm/slub.c:5294 [inline]
__kmalloc_noprof+0x479/0x760 mm/slub.c:5307
kmalloc_noprof include/linux/slab.h:954 [inline]
sk_prot_alloc+0xec/0x220 net/core/sock.c:2247
sk_alloc+0x3f/0x390 net/core/sock.c:2303
__netlink_create+0x6a/0x270 net/netlink/af_netlink.c:626
__netlink_kernel_create+0x142/0x720 net/netlink/af_netlink.c:2018
netlink_kernel_create include/linux/netlink.h:62 [inline]
audit_net_init+0xcd/0x200 kernel/audit.c:1703
ops_init+0x361/0x5d0 net/core/net_namespace.c:137
setup_net+0x11d/0x350 net/core/net_namespace.c:446
copy_net_ns+0x3e7/0x570 net/core/net_namespace.c:579
create_new_namespaces+0x3ec/0x6a0 kernel/nsproxy.c:132
page last free pid 3250 tgid 3250 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
__free_pages_prepare mm/page_alloc.c:1402 [inline]
__free_frozen_pages+0xb4c/0xca0 mm/page_alloc.c:2943
__slab_free+0x279/0x2c0 mm/slub.c:5612
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x99/0x100 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350
kasan_slab_alloc include/linux/kasan.h:253 [inline]
slab_post_alloc_hook mm/slub.c:4569 [inline]
slab_alloc_node mm/slub.c:4898 [inline]
__do_kmalloc_node mm/slub.c:5294 [inline]
__kmalloc_noprof+0x31b/0x760 mm/slub.c:5307
kmalloc_noprof include/linux/slab.h:954 [inline]
tomoyo_realpath_from_path+0xe8/0x5e0 security/tomoyo/realpath.c:251
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_path_perm+0x288/0x570 security/tomoyo/file.c:827
security_inode_getattr+0x119/0x290 security/security.c:1895
vfs_getattr fs/stat.c:259 [inline]
vfs_fstat fs/stat.c:281 [inline]
__do_sys_newfstat fs/stat.c:551 [inline]
__se_sys_newfstat fs/stat.c:546 [inline]
__x64_sys_newfstat+0x140/0x270 fs/stat.c:546
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x15f/0x560 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Memory state around the buggy address:
ffff88811ac34700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88811ac34780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88811ac34800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88811ac34880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88811ac34900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Attachments:
reproducer.syz
reproducer.c