Re: [syzbot] [mm?] kernel BUG in const_folio_flags
From: Muchun Song
Date: Thu Mar 21 2024 - 05:50:39 EST
> On Mar 21, 2024, at 12:04, syzbot <syzbot+3b9148f91b7869120e81@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 78c3925c048c Merge tag 'soc-late-6.9' of git://git.kernel...
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1267d879180000
> kernel config: https://syzkaller.appspot.com/x/.config?x=f3c2635ded15fbc9
> dashboard link: https://syzkaller.appspot.com/bug?extid=3b9148f91b7869120e81
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: i386
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-78c3925c.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/cf2bceeccde3/vmlinux-78c3925cxz
> kernel image: https://storage.googleapis.com/syzbot-assets/fc938dfaea6d/bzImage-78c3925cxz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3b9148f91b7869120e81@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> veth_newlink+0x627/0xa10 drivers/net/veth.c:1895
> rtnl_newlink_create net/core/rtnetlink.c:3494 [inline]
> __rtnl_newlink+0x119c/0x1960 net/core/rtnetlink.c:3714
> rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3727
> rtnetlink_rcv_msg+0x3c7/0xe60 net/core/rtnetlink.c:6595
> ------------[ cut here ]------------
> kernel BUG at include/linux/page-flags.h:315!
Here is some more page dump information from the console:
[ 61.367144][ T42] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff888028132880 pfn:0x28130
[ 61.371430][ T42] flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff)
[ 61.374455][ T42] page_type: 0xffffffff()
[ 61.376096][ T42] raw: 00fff80000000000 ffff888015ecd540 dead000000000100 0000000000000000
[ 61.379994][ T42] raw: ffff888028132880 0000000000190000 00000000ffffffff 0000000000000000
Alright, the page is freed (with a refcount of 0).
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 1 PID: 42 Comm: kcompactd0 Not tainted 6.8.0-syzkaller-11725-g78c3925c048c #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:const_folio_flags+0x1bd/0x1f0 include/linux/page-flags.h:315
The RIP is in const_folio_flags() (called from folio_test_hugetlb()):
VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
It is reasonable to WARN because the page is freed (PG_head is not set
in this case).
The comment on folio_test_hugetlb() says "Caller should have a
reference on the folio", so since commit 9c5ccf2db04b the caller of
PageHuge() should grab a refcount before calling folio_test_hugetlb().
Without an extra refcount on the @page, a true return from
PageHuge(@page) does not guarantee that the @page is a HugeTLB page.
It seems the WARN could be acceptable, so should we remove this WARN?
I am not sure. Cc more experts.
Thanks.
> Code: 41 83 e4 01 44 89 e6 e8 b1 e6 a9 ff 45 84 e4 0f 85 c4 fe ff ff e8 23 ec a9 ff 48 c7 c6 e0 07 1b 8b 48 89 ef e8 34 2e ed ff 90 <0f> 0b e8 8c 6b 06 00 e9 66 fe ff ff 48 89 ef e8 7f 6b 06 00 eb b6
> RSP: 0018:ffffc9000068f7f0 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffc9000068f698
> RDX: ffff88801744c880 RSI: ffffffff81e4265c RDI: ffffffff8b6f0060
> RBP: ffffea0000a04c00 R08: 0000000000000000 R09: fffffbfff1f3deca
> R10: ffffffff8f9ef657 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffea0000a04dc0 R14: 0000000000028137 R15: ffffc9000068fbe8
> FS: 0000000000000000(0000) GS:ffff88802c300000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffe623b9138 CR3: 000000001c22c000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> folio_test_hugetlb include/linux/page-flags.h:875 [inline]
> PageHuge+0x219/0x2b0 mm/hugetlb.c:2174
> isolate_migratepages_block+0x4a0/0x5110 mm/compaction.c:1004
> isolate_migratepages mm/compaction.c:2182 [inline]
> compact_zone+0x1a5c/0x4280 mm/compaction.c:2629
> kcompactd_do_work+0x340/0x720 mm/compaction.c:3100
> kcompactd+0x8d7/0xde0 mm/compaction.c:3199
> kthread+0x2c1/0x3a0 kernel/kthread.c:388
> ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
> </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup