Re: WARNING in task_participate_group_stop

From: Oleg Nesterov
Date: Tue Oct 31 2017 - 12:35:01 EST


On 10/30, Dmitry Vyukov wrote:
>
> On Mon, Oct 30, 2017 at 10:12 PM, syzbot
> <bot+c9f0eb0d2a5576ece331a767528e6b52b4ff1815@xxxxxxxxxxxxxxxxxxxxxxxxx>
> wrote:
> > Hello,
> >
> > syzkaller hit the following crash on
> > d95e159cd1da1ed4dbf76bf203e8ffaf231395e4
> > git://git.cmpxchg.org/linux-mmots.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached

Hmm. I do not see the reproducer in this email...

> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
>
> This also happens on more recent commits, including linux-next
> 36ef71cae353f88fd6e095e2aaa3e5953af1685d (Oct 19) and upstream
> 3e0cc09a3a2c40ec1ffb6b4e12da86e98feccb11 (Oct 18).
>
> > WARNING: CPU: 0 PID: 1 at kernel/signal.c:340
> > task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
> > Kernel panic - not syncing: panic_on_warn set ...
> >
> > CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-mm1+ #5

Looks familiar... I need some time to recall the details, will try to send
the fix(es) this week.

So this is the init process with the SIGNAL_UNKILLABLE flag set. And I hope
it has a pending SIGKILL, otherwise something else is going on.

IIRC the problem is that complete_signal(SIGKILL) does nothing if
SIGNAL_UNKILLABLE is set, in particular it doesn't set SIGNAL_GROUP_EXIT.
This fools the signal_group_exit() check in do_signal_stop().
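
To make the above concrete, here is a toy userspace sketch of the
interaction, not the kernel code. The flag names and values mirror
include/linux/sched/signal.h, but the toy_* struct and helpers are
hypothetical and heavily simplified. In particular, the step where the
stop bit is refused while SIGKILL is pending is my assumption about how
the WARNING gets reached.

```c
#include <stdbool.h>

/* Flag names and values as in the kernel headers; everything else is a toy. */
#define SIGNAL_GROUP_EXIT 0x00000004u
#define SIGNAL_UNKILLABLE 0x00000040u

/* Hypothetical stand-in for the relevant bits of struct signal_struct. */
struct toy_signal {
	unsigned int flags;
	bool sigkill_pending;
};

/*
 * Simplified model of complete_signal(SIGKILL): the signal is queued
 * either way, but the group exit is started only for a killable group.
 */
static void toy_complete_sigkill(struct toy_signal *sig)
{
	sig->sigkill_pending = true;
	if (!(sig->flags & (SIGNAL_UNKILLABLE | SIGNAL_GROUP_EXIT)))
		sig->flags |= SIGNAL_GROUP_EXIT;
}

/* Simplified model of signal_group_exit(): only looks at the flag. */
static bool toy_signal_group_exit(const struct toy_signal *sig)
{
	return (sig->flags & SIGNAL_GROUP_EXIT) != 0;
}

/*
 * Simplified model of do_signal_stop(): returns true if we would get as
 * far as participating in a group stop without the stop bit being set,
 * which is the state the WARNING complains about.
 */
static bool toy_stop_would_warn(const struct toy_signal *sig)
{
	if (toy_signal_group_exit(sig))
		return false;	/* normal case: bail out, the group is exiting */

	/* Assumption: the stop bit is refused while SIGKILL is pending. */
	bool stop_pending = !sig->sigkill_pending;

	return !stop_pending;
}
```

With flags == SIGNAL_UNKILLABLE, toy_complete_sigkill() leaves
SIGNAL_GROUP_EXIT clear, so toy_stop_would_warn() returns true. With
flags == 0 the group exit is flagged and the stop path bails out early.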

Actually there are more problems with SIGNAL_UNKILLABLE && SIGKILL; we need
some nasty cleanups.

Oleg.


> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> > __dump_stack lib/dump_stack.c:16 [inline]
> > dump_stack+0x194/0x257 lib/dump_stack.c:52
> > panic+0x1e4/0x417 kernel/panic.c:181
> > __warn+0x1c4/0x1d9 kernel/panic.c:542
> > report_bug+0x211/0x2d0 lib/bug.c:183
> > fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
> > do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
> > do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
> > do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
> > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
> > invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
> > RIP: 0010:task_participate_group_stop+0x1ce/0x230 kernel/signal.c:340
> > RSP: 0018:ffff8801d9ee77f0 EFLAGS: 00010097
> > RAX: ffff8801d9ed8040 RBX: ffff8801d9ed8040 RCX: ffff8801d9edb2c0
> > RDX: 0000000000000000 RSI: 0000000000060013 RDI: ffff8801d9ed84d0
> > RBP: ffff8801d9ee7808 R08: ffff8801d9ee7180 R09: ffff8801d9ee7178
> > R10: ffff8801d9ee70f0 R11: 1ffff1003b3db29b R12: ffff8801d9ee9740
> > R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8801d9ed85c8
> > do_signal_stop+0x217/0x900 kernel/signal.c:2042
> > get_signal+0x61c/0x17e0 kernel/signal.c:2297
> > do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
> > exit_to_usermode_loop+0x224/0x300 arch/x86/entry/common.c:158
> > prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
> > syscall_return_slowpath+0x42f/0x500 arch/x86/entry/common.c:266
> > entry_SYSCALL_64_fastpath+0xbc/0xbe
> > RIP: 0033:0x7f33f723fdd3
> > RSP: 002b:00007fffb5303398 EFLAGS: 00000246 ORIG_RAX: 0000000000000017
> > RAX: fffffffffffffdfe RBX: 00007fffb5303540 RCX: 00007f33f723fdd3
> > RDX: 0000000000000000 RSI: 00007fffb53036f0 RDI: 000000000000000b
> > RBP: 00007fffb53036f0 R08: 00007fffb5303770 R09: 0000000000000001
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
> > R13: 00007fffb5303ad0 R14: 0000000000000000 R15: 0000000000000000
> >
> >
> > ---
> > This bug is generated by a dumb bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for details.
> > Direct all questions to syzkaller@xxxxxxxxxxxxxxxxx
> >
> > syzbot will keep track of this bug report.
> > Once a fix for this bug is committed, please reply to this email with:
> > #syz fix: exact-commit-title
> > To mark this as a duplicate of another syzbot report, please reply with:
> > #syz dup: exact-subject-of-another-report
> > If it's a one-off invalid bug report, please reply with:
> > #syz invalid
> > Note: if the crash happens again, it will cause creation of a new bug
> > report.