Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit

From: chenqiwu
Date: Sat Dec 14 2019 - 01:40:13 EST


On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote:
> On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> > can't you use is_global_init() && group_dead ?
>
> Seems reasonable.
> Looks like we can move
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> further up...
>
> (Ideally I'd like to have a test for this to ensure that this lets you
> capture a global init coredump but that might be tricky. But since you've
> seem to have run into this case maybe you even have something that could
> be turned into a test? (Similar to how we already have a purely opt-in
> test for pstore.))
>
> Christian

Hi all,
I agree that using is_global_init() && group_dead is more reasonable.

The crash isuee happened on a Android phone by reboot stress test.
panic log:
[ 84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 84.048521]
[ 84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S O 4.14.117-perf-g8035d1a #1
[ 84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT)
[ 84.048550] Call trace:
[ 84.048564] dump_backtrace+0x0/0x268
[ 84.048569] show_stack+0x14/0x20
[ 84.048577] dump_stack+0xc4/0x100
[ 84.048584] panic+0x1f0/0x410
[ 84.048591] complete_and_exit+0x0/0x20
[ 84.048596] do_group_exit+0x8c/0xa0
[ 84.048602] get_signal+0x1c0/0x790
[ 84.048608] do_notify_resume+0x184/0xc30
[ 84.048613] work_pending+0x8/0x10

>From the kdump loaded by crash utility, all threads of global init have exited,
the group_dead value of global init has truned to 0 by atomic_dec_and_test().
crash> ps init
PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 1 0 2 ffffffcd77526000 ?? 0.0 0 0 init
534 1 4 ffffffcd6b9a9000 ZO 0.0 0 0 init
535 1 4 ffffffcd6b9aa000 ZO 0.0 0 0 init
crash> ps -g 1
PID: 1 TASK: ffffffcd77526000 CPU: 2 COMMAND: "init"
(no threads)
crash> struct task_struct.signal ffffffcd77526000
signal = 0xffffffcd77530000
crash> struct signal_struct 0xffffffcd77530000
struct signal_struct {
sigcnt = {
counter = 1
},
live = {
counter = 0
},
nr_threads = 1,
thread_head = {
next = 0xffffffcd77526730,
prev = 0xffffffcd77526730
},
group_exit_code = 11,
notify_count = 0,
group_exit_task = 0x0,
group_stop_count = 0,
flags = 4,
...
}

However, as Christian said, the test for this is tricky since we must
make sure all of init threads exited. I make a test for is_global_init()
and send a SIGSEGV signal to global init task in userspace. The phone
crash imeddiately and reboot to collect kdump. Then I extract the coredump
of global init task from kdump successfully.
(gdb) core-file core.1.init
Core was generated by `/system/bin/init second_stage'.
#0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
9 cmn x0, #(MAX_ERRNO + 1)
(gdb) bt
#0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
#1 0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11)
at system/core/init/reboot_utils.cpp:141
#2 0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138
#3 0x0000007f8de236a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

So from the following test, we have confidence that the following patch can help us
to get the extra coredump of init when global init task do real exit.

diff --git a/kernel/exit.c b/kernel/exit.c
index 8e288e8..9454106 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -476,10 +476,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
}

write_unlock_irq(&tasklist_lock);
- if (unlikely(pid_ns == &init_pid_ns)) {
- panic("Attempted to kill init! exitcode=0x%08x\n",
- father->signal->group_exit_code ?: father->exit_code);
- }

list_for_each_entry_safe(p, n, dead, ptrace_entry) {
list_del_init(&p->ptrace_entry);
@@ -687,6 +683,10 @@ void do_exit(long code)
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");

+ group_dead = atomic_dec_and_test(&tsk->signal->live);
+ if (unlikely(is_global_init(tsk) && group_dead))
+ panic("Attempted to kill init! exitcode=0x%08lx\n", code);
+
/*
* If do_exit is called because this processes oopsed, it's possible
* that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
@@ -743,7 +743,6 @@ void do_exit(long code)
if (tsk->mm)
sync_mm_rss(tsk->mm);
acct_update_integrals(tsk);
- group_dead = atomic_dec_and_test(&tsk->signal->live);
if (group_dead) {
hrtimer_cancel(&tsk->signal->real_timer);
exit_itimers(tsk->signal);