Re: [syzbot ci] Re: mm: switch THP shrinker to list_lru

From: Johannes Weiner

Date: Fri Mar 13 2026 - 19:08:43 EST


On Fri, Mar 13, 2026 at 10:39:38AM -0700, syzbot ci wrote:
> ------------[ cut here ]------------
> !css_is_dying(&memcg->css)
> WARNING: mm/list_lru.c:110 at lock_list_lru_of_memcg+0x33d/0x470 mm/list_lru.c:110, CPU#0: syz.0.17/5950
> Modules linked in:
> CPU: 0 UID: 0 PID: 5950 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:lock_list_lru_of_memcg+0x33d/0x470 mm/list_lru.c:110
> Code: 3c 28 00 74 08 4c 89 e7 e8 b0 02 1d 00 4d 8b 24 24 48 8b 54 24 20 4d 85 e4 0f 85 00 fe ff ff e9 75 fe ff ff e8 d4 df b3 ff 90 <0f> 0b 90 eb c1 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c 06 fe ff ff 48
> RSP: 0018:ffffc90004017110 EFLAGS: 00010093
> RAX: ffffffff8211b3ac RBX: 0000000000000000 RCX: ffff888104f057c0
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffff888104f057c0 R09: 0000000000000002
> R10: 0000000000000406 R11: 0000000000000000 R12: ffff8881026d0d00
> R13: dffffc0000000000 R14: ffffffff9a2de05c R15: 0000000000000002
> FS: 0000555572bfe500(0000) GS:ffff88818de66000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000200000001000 CR3: 0000000112554000 CR4: 00000000000006f0
> Call Trace:
> <TASK>
> __folio_freeze_and_split_unmapped+0x2ab/0x34b0 mm/huge_memory.c:3767
> __folio_split+0xae1/0x1570 mm/huge_memory.c:4033
> try_folio_split_to_order include/linux/huge_mm.h:411 [inline]
> try_folio_split_or_unmap+0x5b/0x1e0 mm/truncate.c:189
> truncate_inode_partial_folio+0x4ab/0x8e0 mm/truncate.c:255

File pages aren't on the deferred_split_lru. We're calling
list_lru_lock() on a nid+memcg combination that doesn't have list_lru
heads allocated. Either list_lru_lock() should fail gracefully in that
case, or __folio_freeze_and_split_unmapped() needs to filter by page
type. Needs more thought.
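
To illustrate the "fail gracefully" option, here is a rough userspace
model, not kernel code. All toy_* names and MAX_MEMCG are made up for
the sketch. The only point is that a lookup for a memcg with no
allocated head returns NULL and the caller skips the removal instead of
warning:

```c
#include <stddef.h>
#include <stdbool.h>

#define MAX_MEMCG 4	/* hypothetical bound, for this model only */

/* Stand-in for one per-memcg list head (the real one also has a lock). */
struct toy_lru_one {
	long nr_items;
};

/* Stand-in for a memcg-aware lru: a slot stays NULL until allocated. */
struct toy_lru {
	struct toy_lru_one *per_memcg[MAX_MEMCG];
};

/* Return the list for memcg_id, or NULL if none was ever allocated. */
static struct toy_lru_one *toy_lru_lookup(struct toy_lru *lru, int memcg_id)
{
	if (memcg_id < 0 || memcg_id >= MAX_MEMCG)
		return NULL;
	return lru->per_memcg[memcg_id];
}

/* Remove one item if a list exists; return false instead of warning. */
static bool toy_lru_del(struct toy_lru *lru, int memcg_id)
{
	struct toy_lru_one *l = toy_lru_lookup(lru, memcg_id);

	if (!l || l->nr_items == 0)
		return false;	/* e.g. a folio that was never queued */
	l->nr_items--;
	return true;
}
```

The alternative, filtering by page type in the caller, would avoid the
lookup entirely for folios that can never be on the list.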

> possible deadlock in __folio_end_writeback
>
> =====================================================
> WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> syzkaller #0 Not tainted
> -----------------------------------------------------
> syz.0.17/5949 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> ffff88810c90c240 (&l->lock){+.+.}-{3:3}, at: spin_lock include/linux/spinlock.h:341 [inline]
> ffff88810c90c240 (&l->lock){+.+.}-{3:3}, at: lock_list_lru mm/list_lru.c:26 [inline]
> ffff88810c90c240 (&l->lock){+.+.}-{3:3}, at: lock_list_lru_of_memcg+0x268/0x470 mm/list_lru.c:95
>
> and this task is already holding:
> ffff8881107ad160 (&xa->xa_lock#9){..-.}-{3:3}, at: spin_lock include/linux/spinlock.h:341 [inline]
> ffff8881107ad160 (&xa->xa_lock#9){..-.}-{3:3}, at: __folio_split+0xa2e/0x1570 mm/huge_memory.c:4025
> which would create a new lock dependency:
> (&xa->xa_lock#9){..-.}-{3:3} -> (&l->lock){+.+.}-{3:3}
>
> but this new dependency connects a SOFTIRQ-irq-safe lock:
> (&xa->xa_lock#9){..-.}-{3:3}
>
> ... which became SOFTIRQ-irq-safe at:
> lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868
> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:132 [inline]
> _raw_spin_lock_irqsave+0x40/0x60 kernel/locking/spinlock.c:162
> __folio_end_writeback+0x157/0x770 mm/page-writeback.c:2946
>
> to a SOFTIRQ-irq-unsafe lock:
> (&l->lock){+.+.}-{3:3}
>
> ... which became SOFTIRQ-irq-unsafe at:
> ...
> lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868
> __raw_spin_lock include/linux/spinlock_api_smp.h:158 [inline]
> _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
> spin_lock include/linux/spinlock.h:341 [inline]
> lock_list_lru mm/list_lru.c:26 [inline]
> lock_list_lru_of_memcg+0x268/0x470 mm/list_lru.c:95
> list_lru_lock mm/list_lru.c:154 [inline]
> list_lru_add+0x46/0x260 mm/list_lru.c:208
> list_lru_add_obj+0x191/0x270 mm/list_lru.c:221
> d_lru_add+0xd6/0x160 fs/dcache.c:497

These are different locks; deferred_split_lru needs its own lockdep key.