Re: [syzbot] unexpected kernel reboot (4)

From: Dmitry Vyukov
Date: Thu Apr 22 2021 - 10:20:35 EST


On Thu, Apr 22, 2021 at 12:16 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2021/04/15 1:16, Tetsuo Handa wrote:
> > On 2021/04/15 0:39, Andrey Konovalov wrote:
> >> On Wed, Apr 14, 2021 at 7:45 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >>>
> >>> On Tue, Apr 13, 2021 at 11:27 PM syzbot
> >>> <syzbot+9ce030d4c89856b27619@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> syzbot found the following issue on:
> >>>>
> >>>> HEAD commit: 89698bec Merge tag 'm68knommu-for-v5.12-rc7' of git://git...
> >>>> git tree: upstream
> >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1243fcfed00000
> >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=b234ddbbe2953747
> >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=9ce030d4c89856b27619
> >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=173e92fed00000
> >>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1735da2ed00000
> >>>>
> >>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>> Reported-by: syzbot+9ce030d4c89856b27619@xxxxxxxxxxxxxxxxxxxxxxxxx
> >>>>
> >>>> output_len: 0x000000000e74eb68
> >>>> kernel_total_size: 0x000000000f226000
> >>>> needed_size: 0x000000000f400000
> >>>> trampoline_32bit: 0x000000000009d000
> >>>> Decompressing Linux... Parsing ELF... done.
> >>>> Booting the kernel.
> >>>
> >>> +linux-input
> >>>
> >>> The reproducer connects some USB HID device and communicates with the driver.
> >>> Previously we observed reboots because HID devices can trigger reboot
> >>> SYSRQ, but we disable it with "CONFIG_MAGIC_SYSRQ is not set".
> >>> How else can a USB device reboot the machine? Is it possible to disable it?
> >>> I don't see any direct includes of <linux/reboot.h> in drivers/usb/*
> >>
> >> This happens when a keyboard sends the Ctrl+Alt+Del sequence, see
> >> fn_boot_it()->ctrl_alt_del() in drivers/tty/vt/keyboard.c.
>
> Hmm, maybe the reproducer I use and the one "#syz test:" uses differ.
> But since "#syz test:" did not trigger the problem if
> https://syzkaller.appspot.com/x/patch.diff?x=14ba0851d00000 is applied,
> can we add
>
> 	if (fork() == 0) {
> 		char buf[20] = { };
> 		int fd = open("/proc/sys/kernel/ctrl-alt-del", O_WRONLY);
> 		write(fd, "0\n", 2);
> 		close(fd);
> 		fd = open("/proc/sys/kernel/cad_pid", O_WRONLY);
> 		snprintf(buf, sizeof(buf) - 1, "%d\n", getpid());
> 		write(fd, buf, strlen(buf));
> 		close(fd);
> 	}
>
> to the common setup function? This will serve as a temporary workaround
> until Linus accepts disable-specific-functionality changes.
>
> There is no need to keep the process referenced by /proc/sys/kernel/cad_pid
> alive, because what is saved there is a "struct pid", which can remain after
> the process terminates.

I've prepared this syzkaller change:
https://github.com/google/syzkaller/pull/2550/files
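
For illustration only, and not necessarily what the pull request above
does: a minimal C sketch of the quoted workaround with error checking
added. The helper names write_string() and neutralize_ctrl_alt_del(),
and the "dir" parameter (which would be "/proc/sys/kernel" in real use),
are assumptions made for this sketch. Unlike the quoted snippet, the
child calls _exit() so that it does not fall through into the rest of
the setup code.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write a whole string to an existing file; 0 on success, -1 on error. */
static int write_string(const char *path, const char *s)
{
	size_t len = strlen(s);
	int rc = 0;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	if (write(fd, s, len) != (ssize_t)len)
		rc = -1;
	if (close(fd) != 0)
		rc = -1;
	return rc;
}

/*
 * Hypothetical helper. "dir" is "/proc/sys/kernel" in real use; it is a
 * parameter only so that the sketch can be tried against ordinary files.
 *
 * Writing 0 to ctrl-alt-del makes Ctrl+Alt+Del send SIGINT to cad_pid
 * instead of rebooting; pointing cad_pid at a short-lived child keeps
 * that signal away from init.
 */
static int neutralize_ctrl_alt_del(const char *dir)
{
	int status;
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0) {
		char path[256];
		char buf[32];
		int failed = 0;

		snprintf(path, sizeof(path), "%s/ctrl-alt-del", dir);
		if (write_string(path, "0\n") != 0)
			failed = 1;
		snprintf(path, sizeof(path), "%s/cad_pid", dir);
		snprintf(buf, sizeof(buf), "%d\n", (int)getpid());
		if (write_string(path, buf) != 0)
			failed = 1;
		_exit(failed);	/* never return into the caller's code */
	}
	if (waitpid(child, &status, 0) != child)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would run neutralize_ctrl_alt_del("/proc/sys/kernel") once as
root during setup and treat a -1 result as "workaround not applied".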

Re hibernation/suspend configs, you said disabling them is not
helping, right? Does it still make sense to disable them?
If these configs are enabled, we can at least find some bugs in the
code that prepares for suspend. However, as you noted, actually
suspending will immediately lead to "lost connection".
Ideally we would somehow tweak hibernation/suspend so that it gets to
the hibernation/suspend point and then immediately and automatically
resumes. This way we could test both the suspend and the resume code,
which I assume can contain bugs, without causing "lost connection".
I guess such a mode does not exist today... and I am not
sure what happens with TCP connections after this.