Re: [syzbot] [mm?] BUG: stack guard page was hit in compat_sys_open

From: Kirill A. Shutemov
Date: Mon Oct 14 2024 - 07:38:24 EST


On Sun, Oct 13, 2024 at 11:01:33PM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 27cc6fdf7201 Merge tag 'linux_kselftest-fixes-6.12-rc2' of..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13043307980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e3e4d87a80ed4297
> dashboard link: https://syzkaller.appspot.com/bug?extid=0e1748603cc9f2dfc87d
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: i386
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-27cc6fdf.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/ae2f7d656e32/vmlinux-27cc6fdf.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/1b06a62cc1e5/bzImage-27cc6fdf.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+0e1748603cc9f2dfc87d@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> BUG: TASK stack guard page was hit at ffffc90002b3ffb8 (stack is ffffc90002b40000..ffffc90002b48000)
> Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 0 UID: 0 PID: 12425 Comm: syz.2.2179 Not tainted 6.12.0-rc1-syzkaller-00306-g27cc6fdf7201 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> RIP: 0010:mark_lock+0x25/0xc60 kernel/locking/lockdep.c:4686
> Code: 90 90 90 90 90 55 48 89 e5 41 57 41 56 41 89 d6 48 ba 00 00 00 00 00 fc ff df 41 55 41 54 53 48 83 e4 f0 48 81 ec 10 01 00 00 <48> c7 44 24 30 b3 8a b5 41 48 8d 44 24 30 48 c7 44 24 38 38 51 57
> RSP: 0018:ffffc90002b3ffc0 EFLAGS: 00010086
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000003
> RDX: dffffc0000000000 RSI: ffff888021edaf98 RDI: ffff888021eda440
> RBP: ffffc90002b40100 R08: 0000000000000000 R09: 0000000000000006
> R10: ffffffff9698ad37 R11: 0000000000000002 R12: dffffc0000000000
> R13: ffff888021edaf98 R14: 0000000000000008 R15: ffff888021eda440
> FS: 0000000000000000(0000) GS:ffff88802b400000(0063) knlGS:00000000f56f6b40
> CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: ffffc90002b3ffb8 CR3: 000000005f61e000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <#DF>
> </#DF>
> <TASK>
> mark_usage kernel/locking/lockdep.c:4646 [inline]
> __lock_acquire+0x906/0x3ce0 kernel/locking/lockdep.c:5156
> lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5825
> rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
> rcu_read_lock include/linux/rcupdate.h:849 [inline]
> page_ext_get+0x3a/0x310 mm/page_ext.c:525
> __set_page_owner+0x9a/0x790 mm/page_owner.c:322
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1537
> prep_new_page mm/page_alloc.c:1545 [inline]
> get_page_from_freelist+0x101e/0x3070 mm/page_alloc.c:3457
> __alloc_pages_noprof+0x223/0x25c0 mm/page_alloc.c:4733
> alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265
> alloc_slab_page mm/slub.c:2412 [inline]
> allocate_slab mm/slub.c:2578 [inline]
> new_slab+0x2ba/0x3f0 mm/slub.c:2631
> ___slab_alloc+0xd1d/0x16f0 mm/slub.c:3818
> __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908
> __slab_alloc_node mm/slub.c:3961 [inline]
> slab_alloc_node mm/slub.c:4122 [inline]
> kmem_cache_alloc_noprof+0x2ae/0x2f0 mm/slub.c:4141
> p9_tag_alloc+0x9c/0x870 net/9p/client.c:281
> p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644
> p9_client_zc_rpc.constprop.0+0x105/0x880 net/9p/client.c:793
> p9_client_read_once+0x443/0x820 net/9p/client.c:1560
> p9_client_read+0x13f/0x1b0 net/9p/client.c:1524
> v9fs_issue_read+0x115/0x310 fs/9p/vfs_addr.c:74
> netfs_retry_read_subrequests fs/netfs/read_retry.c:60 [inline]
> netfs_retry_reads+0x153a/0x1d00 fs/netfs/read_retry.c:232
> netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:369
> netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:405
> netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
> netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:369
> netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:405
> netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
> netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:369
> netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:405
> netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
> netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:369
> netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:405
> netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
> netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:369
> netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:405
> netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235
> netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:369
> netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:405
...


This recursion looks broken:

netfs_rreq_terminated
netfs_rreq_assess
netfs_retry_reads
netfs_rreq_terminated
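
To make the shape of the problem concrete, here is a toy userspace model of
that cycle. It is only a sketch: the toy_* names, the struct and the depth
counters are hypothetical and are not netfs code. It assumes that every retry
completes synchronously and re-enters the termination path, which is what the
trace suggests. The loop variant only illustrates that iterating keeps the
nesting constant; it is not a proposed fix.

```c
/* Toy model of the call cycle; none of this is real netfs code. */
struct toy_rreq {
    int retries_left;  /* retries still to be issued */
    int depth;         /* current nesting of toy_terminated() */
    int max_depth;     /* deepest nesting observed */
};

static void toy_terminated(struct toy_rreq *r);

/* Stand-in for netfs_retry_reads(): reissue, then complete synchronously. */
static void toy_retry_reads(struct toy_rreq *r)
{
    r->retries_left--;
    toy_terminated(r);  /* re-enters before the caller has unwound */
}

/* Stand-in for netfs_rreq_assess(): decide whether a retry is needed. */
static void toy_assess(struct toy_rreq *r)
{
    if (r->retries_left > 0)
        toy_retry_reads(r);
}

/* Stand-in for netfs_rreq_terminated(): records how deep we are nested. */
static void toy_terminated(struct toy_rreq *r)
{
    r->depth++;
    if (r->depth > r->max_depth)
        r->max_depth = r->depth;
    toy_assess(r);
    r->depth--;
}

/* Same work done with a loop: retries no longer add nesting. */
static void toy_terminated_loop(struct toy_rreq *r)
{
    r->depth++;
    if (r->depth > r->max_depth)
        r->max_depth = r->depth;
    while (r->retries_left > 0)
        r->retries_left--;  /* retry in place instead of recursing */
    r->depth--;
}

/* Deepest nesting reached by the recursive variant. */
int recursive_max_depth(int retries)
{
    struct toy_rreq r = { retries, 0, 0 };
    toy_terminated(&r);
    return r.max_depth;
}

/* Deepest nesting reached by the loop variant. */
int iterative_max_depth(int retries)
{
    struct toy_rreq r = { retries, 0, 0 };
    toy_terminated_loop(&r);
    return r.max_depth;
}
```

In this model the recursive variant nests one level deeper per retry, while
the loop variant stays at a nesting depth of one regardless of the retry
count.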

It ate the stack: the cycle repeats until the task stack is exhausted and the
guard page is hit.
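
The addresses in the report bear this out. The small sketch below just does
the subtraction on the values quoted above (stack range, fault address and
RSP); the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Values copied from the report: stack range, faulting address (CR2), RSP. */
#define STACK_LOW   0xffffc90002b40000ULL
#define STACK_HIGH  0xffffc90002b48000ULL
#define FAULT_ADDR  0xffffc90002b3ffb8ULL
#define RSP_VALUE   0xffffc90002b3ffc0ULL

/* Size of the task stack in bytes. */
uint64_t stack_size_bytes(void)
{
    return STACK_HIGH - STACK_LOW;
}

/* How far below the lowest valid stack address the fault landed. */
uint64_t fault_bytes_below_stack(void)
{
    return STACK_LOW - FAULT_ADDR;
}

/* How far below the lowest valid stack address RSP already was. */
uint64_t rsp_bytes_below_stack(void)
{
    return STACK_LOW - RSP_VALUE;
}
```

That works out to a 32 KiB stack, with RSP 64 bytes and the faulting access 72
bytes below its lower bound, so the whole stack had been consumed when
mark_lock() was entered.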

It seems to have been introduced by commit ee4cdf7ba857 ("netfs: Speed up
buffered reading").

David?

--
Kiryl Shutsemau / Kirill A. Shutemov