Re: [syzbot] kernel panic: kernel stack overflow
From: Jiri Pirko
Date: Thu Oct 13 2022 - 03:11:38 EST
Wed, Oct 12, 2022 at 06:42:39PM CEST, edumazet@xxxxxxxxxx wrote:
>On Wed, Oct 12, 2022 at 8:08 AM Jiri Pirko <jiri@xxxxxxxxxxx> wrote:
>>
>> Wed, Oct 12, 2022 at 03:54:59PM CEST, dvyukov@xxxxxxxxxx wrote:
>> >On Wed, 12 Oct 2022 at 15:11, Jiri Pirko <jiri@xxxxxxxxxxx> wrote:
>> >>
>> >> Wed, Oct 12, 2022 at 09:53:27AM CEST, dvyukov@xxxxxxxxxx wrote:
>> >> >On Wed, 12 Oct 2022 at 09:48, syzbot
>> >> ><syzbot+60748c96cf5c6df8e581@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> >> >>
>> >> >> Hello,
>> >> >>
>> >> >> syzbot found the following issue on:
>> >> >>
>> >> >> HEAD commit: bbed346d5a96 Merge branch 'for-next/core' into for-kernelci
>> >> >> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
>> >> >> console output: https://syzkaller.appspot.com/x/log.txt?x=14a03a2a880000
>> >> >> kernel config: https://syzkaller.appspot.com/x/.config?x=aae2d21e7dd80684
>> >> >> dashboard link: https://syzkaller.appspot.com/bug?extid=60748c96cf5c6df8e581
>> >> >> compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
>> >> >> userspace arch: arm64
>> >> >>
>> >> >> Unfortunately, I don't have any reproducer for this issue yet.
>> >> >>
>> >> >> Downloadable assets:
>> >> >> disk image: https://storage.googleapis.com/syzbot-assets/11078f50b80b/disk-bbed346d.raw.xz
>> >> >> vmlinux: https://storage.googleapis.com/syzbot-assets/398e5f1e6c84/vmlinux-bbed346d.xz
>> >> >>
>> >> >> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> >> >> Reported-by: syzbot+60748c96cf5c6df8e581@xxxxxxxxxxxxxxxxxxxxxxxxx
>> >> >
>> >> >+Jiri
>> >> >
>> >> >It looks like the issue is with the team device. It seems to call
>> >> >itself infinitely.
>> >> >team_device_event was mentioned in stack overflow bugs in the past:
>> >> >https://groups.google.com/g/syzkaller-bugs/search?q=%22team_device_event%22
>> >>
>> >> Hi, do you have dmesg output available by any chance?
>> >
>> >Hi Jiri,
>> >
>> >syzbot attaches dmesg output to every report under the "console output" link.
>>
>> I see. I guess the debug messages are not printed out, I don't see them
>> there. Would it be possible to turn them on?
>
>What debug messages do you need ?
>
>There is a nice stack trace [1] with file:number available
Sure, but no debug printks are printed out during feature
processing. Those could shed some light on whether this is caused by a
lack of nest level enforcement or perhaps, for some reason, by repeated
processing of the same team-port netdevice pair in a loop.
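
To make the first hypothesis concrete, here is a minimal userspace
sketch of what enforcing a nest level could look like. It is not the
team driver or netdev core code. struct sketch_dev, sketch_link() and
SKETCH_MAX_NEST are hypothetical names, the limit of 8 is taken from
the "max nest level of 8" guess quoted below, and each device has at
most one lower device to keep the model small.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed limit, mirroring the "max nest level of 8" guess in this thread. */
#define SKETCH_MAX_NEST 8

/* Hypothetical stand-in for a netdevice; not the kernel's struct net_device. */
struct sketch_dev {
	struct sketch_dev *upper;	/* device stacked on top of us, or NULL */
	struct sketch_dev *lower;	/* device we are stacked on, or NULL */
};

/* Number of devices stacked above dev. */
static int sketch_levels_above(const struct sketch_dev *dev)
{
	int n = 0;

	for (dev = dev->upper; dev; dev = dev->upper)
		n++;
	return n;
}

/* Number of devices stacked below dev. */
static int sketch_levels_below(const struct sketch_dev *dev)
{
	int n = 0;

	for (dev = dev->lower; dev; dev = dev->lower)
		n++;
	return n;
}

/*
 * Stack upper on top of lower. Refuse the link if either side is already
 * linked in that direction, if it would create a loop, or if the resulting
 * chain would be deeper than SKETCH_MAX_NEST devices.
 */
static int sketch_link(struct sketch_dev *lower, struct sketch_dev *upper)
{
	const struct sketch_dev *d;
	int total;

	if (lower->upper || upper->lower)
		return -EBUSY;

	/* Walking up from upper must never reach lower, or we would loop. */
	for (d = upper; d; d = d->upper)
		if (d == lower)
			return -ELOOP;

	/* Devices below lower + lower + upper + devices above upper. */
	total = sketch_levels_below(lower) + 2 + sketch_levels_above(upper);
	if (total > SKETCH_MAX_NEST)
		return -EMLINK;

	lower->upper = upper;
	upper->lower = lower;
	return 0;
}
```

With a check like this, stacking a ninth device on a chain of eight is
refused instead of letting event propagation recurse further. The
sketch says nothing about the second hypothesis, repeated processing of
the same team-port pair, which is where the debug printks would help.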
>
>
>My guess was that for some reason the team driver does not enforce a
>max nest level of 8 ?
>
>https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=65921376425fc9c8b7ce647e1f7989f7cdf5dd70
>
>
>[1]
>CPU: 1 PID: 16874 Comm: syz-executor.3 Not tainted
>6.0.0-rc7-syzkaller-18095-gbbed346d5a96 #0
>Hardware name: Google Google Compute Engine/Google Compute Engine,
>BIOS Google 09/30/2022
>Call trace:
> dump_backtrace+0x1c4/0x1f0 arch/arm64/kernel/stacktrace.c:156
> show_stack+0x2c/0x54 arch/arm64/kernel/stacktrace.c:163
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x104/0x16c lib/dump_stack.c:106
> dump_stack+0x1c/0x58 lib/dump_stack.c:113
> panic+0x218/0x50c kernel/panic.c:274
> nmi_panic+0xbc/0xf0 kernel/panic.c:169
> panic_bad_stack+0x134/0x154 arch/arm64/kernel/traps.c:906
> handle_bad_stack+0x34/0x48 arch/arm64/kernel/entry-common.c:848
> __bad_stack+0x78/0x7c arch/arm64/kernel/entry.S:549
> mark_lock+0x4/0x1b4 kernel/locking/lockdep.c:4593
> lock_acquire+0x100/0x1f8 kernel/locking/lockdep.c:5666
> do_write_seqcount_begin_nested include/linux/seqlock.h:516 [inline]
> do_write_seqcount_begin include/linux/seqlock.h:541 [inline]
> psi_group_change+0x128/0x3d0 kernel/sched/psi.c:705
> psi_task_switch+0x9c/0x310 kernel/sched/psi.c:851
> psi_sched_switch kernel/sched/stats.h:194 [inline]
> __schedule+0x554/0x5a0 kernel/sched/core.c:6489
> preempt_schedule_irq+0x64/0x110 kernel/sched/core.c:6806
> arm64_preempt_schedule_irq arch/arm64/kernel/entry-common.c:265 [inline]
> __el1_irq arch/arm64/kernel/entry-common.c:473 [inline]
> el1_interrupt+0x4c/0x68 arch/arm64/kernel/entry-common.c:485
> el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:490
> el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:577
> arch_local_irq_restore+0x8/0x10 arch/arm64/include/asm/irqflags.h:122
> lock_is_held include/linux/lockdep.h:283 [inline]
> __might_resched+0x7c/0x218 kernel/sched/core.c:9854
> __might_sleep+0x48/0x78 kernel/sched/core.c:9821
> might_alloc include/linux/sched/mm.h:274 [inline]
> slab_pre_alloc_hook mm/slab.h:700 [inline]
> slab_alloc_node mm/slub.c:3162 [inline]
> kmem_cache_alloc_node+0x80/0x370 mm/slub.c:3298
> __alloc_skb+0xf8/0x378 net/core/skbuff.c:422
> alloc_skb include/linux/skbuff.h:1257 [inline]
> nlmsg_new include/net/netlink.h:953 [inline]
> genlmsg_new include/net/genetlink.h:410 [inline]
> ethnl_default_notify+0x16c/0x320 net/ethtool/netlink.c:640
>...