Re: [PATCH v2] exit: add exit_code to trace_sched_process_exit and move it earlier in do_exit()

From: Oleg Nesterov
Date: Mon Feb 26 2024 - 14:46:11 EST


Well. since I have already participated in the previous discussions...

As I said, I can't ack this (user-visible) patch even if I tried to
suggest this from the very beginning, I leave it to the maintainers.

I see nothing wrong in this change, but let me ask: do we really need
to report the exit code? this makes this patch even more user-visible
and I have no idea if it can break the current users.

On 02/23, wenyang.linux@xxxxxxxxxxx wrote:
>
> From: Wen Yang <wenyang.linux@xxxxxxxxxxx>
>
> Currently coredump_task_exit() takes some time to wait for the generation
> of the dump file. But if the user-space wants to receive a notification
> as soon as possible it maybe inconvenient.
>
> Add exit_code to the TP trace_sched_process_exit() and move it earlier in
> do_exit(). This way a user-space monitor could detect the exits and
> potentially make some preparations in advance.
>
> Suggested-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Signed-off-by: Wen Yang <wenyang.linux@xxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> ---
> include/trace/events/sched.h | 28 +++++++++++++++++++++++++---
> kernel/exit.c | 2 +-
> 2 files changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index dbb01b4b7451..c2e8655fd453 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -330,9 +330,31 @@ DEFINE_EVENT(sched_process_template, sched_process_free,
> /*
> * Tracepoint for a task exiting:
> */
> -DEFINE_EVENT(sched_process_template, sched_process_exit,
> - TP_PROTO(struct task_struct *p),
> - TP_ARGS(p));
> +TRACE_EVENT(sched_process_exit,
> +
> + TP_PROTO(struct task_struct *task, long code),
> +
> + TP_ARGS(task, code),
> +
> + TP_STRUCT__entry(
> + __array( char, comm, TASK_COMM_LEN )
> + __field( pid_t, pid )
> + __field( int, prio )
> + __field( long, code )
> + ),
> +
> + TP_fast_assign(
> + memcpy(__entry->comm, task->comm, TASK_COMM_LEN);
> + __entry->pid = task->pid;
> + __entry->prio = task->prio;
> + __entry->code = code;
> + ),
> +
> + TP_printk("comm=%s pid=%d prio=%d exit_code=0x%lx",
> + __entry->comm, __entry->pid, __entry->prio,
> + __entry->code)
> +);
> +
>
> /*
> * Tracepoint for waiting on task to unschedule:
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 493647fd7c07..48b6ed7f7760 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -826,6 +826,7 @@ void __noreturn do_exit(long code)
>
> WARN_ON(tsk->plug);
>
> + trace_sched_process_exit(tsk, code);
> kcov_task_exit(tsk);
> kmsan_task_exit(tsk);
>
> @@ -866,7 +867,6 @@ void __noreturn do_exit(long code)
>
> if (group_dead)
> acct_process();
> - trace_sched_process_exit(tsk);
>
> exit_sem(tsk);
> exit_shm(tsk);
> --
> 2.25.1
>