Re: [syzbot] [mm?] [arch?] BUG: sleeping function called from invalid context in __tlb_batch_free_encoded_pages
From: David Hildenbrand (Arm)
Date: Mon Jun 01 2026 - 11:14:31 EST

On 5/27/26 14:05, Will Deacon wrote:
> I finally got a chance to look at this one...
>
> On Thu, Apr 30, 2026 at 12:21:35AM -0700, syzbot wrote:
>> syzbot found the following issue on:
>>
>> HEAD commit: dca922e019dd Merge tag 'xsa48x-7.1-tag' of git://git.kerne..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=11cd6b6c580000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a169a27b0538ba43e5d3
>> compiler: gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-dca922e0.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/7b447b1b93a9/vmlinux-dca922e0.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/af7830f5dabf/bzImage-dca922e0.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+a169a27b0538ba43e5d3@xxxxxxxxxxxxxxxxxxxxxxxxx
>>
>> BUG: sleeping function called from invalid context at mm/mmu_gather.c:142
>> in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5677, name: rm
>> preempt_count: 0, expected: 0
>> RCU nest depth: 1, expected: 0
>> 2 locks held by rm/5677:
>> #0: ffff888022c20338 (&mm->mmap_lock){++++}-{4:4}, at: mmap_write_lock include/linux/mmap_lock.h:536 [inline]
>> #0: ffff888022c20338 (&mm->mmap_lock){++++}-{4:4}, at: exit_mmap+0x22c/0xa10 mm/mmap.c:1308
>> #1: ffffffff8e7e54e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire.constprop.0+0x7/0x30 include/linux/rcupdate.h:300
>> CPU: 1 UID: 0 PID: 5677 Comm: rm Not tainted syzkaller #0 PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> Call Trace:
>> <TASK>
>> __dump_stack lib/dump_stack.c:94 [inline]
>> dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
>> __might_resched.cold+0x1ec/0x232 kernel/sched/core.c:9162
>> __tlb_batch_free_encoded_pages+0x11e/0x280 mm/mmu_gather.c:142
>> tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
>
> ... so we're getting deep in the gather code tearing down an mm, where
> we free the unmapped pages but we're somehow inside an RCU read-side
> critical section.
>
> The last report on the syzbot dashboard is from 30th April, which
> coincides with 99ebc509eef5 ("mm: memcontrol: fix rcu unbalance in
> get_non_dying_memcg_end()") landing upstream. Unfortunately, there's no
> reproducer available to test that concretely but it looks like we can
> end up in there via the page freeing path. So hopefully this is fixed.

Yes, sounds reasonable.
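
For anyone following along, here is a minimal sketch of why an unbalanced RCU read lock in one place would produce a splat like "RCU nest depth: 1, expected: 0" at a much later sleeping call. It is a toy userspace model, not kernel code: every name in it (model_rcu_read_lock(), unbalanced_helper(), later_sleep_would_warn(), ...) is hypothetical, and it does not reproduce the actual code touched by 99ebc509eef5.

```c
/* Toy userspace model, NOT kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Models the per-task RCU read-side nesting counter. */
static int rcu_nest_depth;

static void model_rcu_read_lock(void)   { rcu_nest_depth++; }
static void model_rcu_read_unlock(void) { rcu_nest_depth--; }

/* Models a might_sleep()-style check: sleeping is only legal at depth 0. */
static bool model_may_sleep(void) { return rcu_nest_depth == 0; }

/* Balanced helper: the lock is paired with an unlock on every path. */
static void balanced_helper(bool early_exit)
{
	model_rcu_read_lock();
	if (early_exit) {
		model_rcu_read_unlock();
		return;
	}
	model_rcu_read_unlock();
}

/* Unbalanced helper: the early-exit path forgets the unlock. */
static void unbalanced_helper(bool early_exit)
{
	model_rcu_read_lock();
	if (early_exit)
		return;		/* leaks one nesting level */
	model_rcu_read_unlock();
}

/*
 * Runs the helper from a clean state and reports whether a later,
 * unrelated sleeping call would be flagged by the check.
 */
static bool later_sleep_would_warn(void (*helper)(bool), bool early_exit)
{
	rcu_nest_depth = 0;
	helper(early_exit);
	return !model_may_sleep();
}
```

In this model, later_sleep_would_warn(unbalanced_helper, true) reports a problem even though the sleeping call itself is innocent, which would match the warning firing in the gather code rather than where the lock was leaked.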
--
Cheers,
David