[PATCH] tracing: Add new_exec tracepoint
From: Marco Elver
Date: Mon Apr 08 2024 - 05:03:17 EST
Add the "new_exec" tracepoint, which runs right after the point of no
return but before the current task assumes its new exec identity.
Unlike the "sched_process_exec" tracepoint, "new_exec" runs before the
old exec is flushed, i.e. while the task still has its original state
(such as the original MM). From this point on, the new exec either
succeeds or crashes; it never returns to the original exec.
Being able to trace this event can be helpful in a number of use cases:
* giving tracing eBPF programs access to the original MM on exec,
before current->mm is replaced;
* counting execs in the original task (via a perf event);
* profiling flush time (from "new_exec" to "sched_process_exec").
Example of tracing output ("new_exec" and "sched_process_exec"):
$ cat /sys/kernel/debug/tracing/trace_pipe
<...>-379 [003] ..... 179.626921: new_exec: filename=/usr/bin/sshd pid=379 comm=sshd
<...>-379 [003] ..... 179.629131: sched_process_exec: filename=/usr/bin/sshd pid=379 old_pid=379
<...>-381 [002] ..... 180.048580: new_exec: filename=/bin/bash pid=381 comm=sshd
<...>-381 [002] ..... 180.053122: sched_process_exec: filename=/bin/bash pid=381 old_pid=381
<...>-385 [001] ..... 180.068277: new_exec: filename=/usr/bin/tty pid=385 comm=bash
<...>-385 [001] ..... 180.069485: sched_process_exec: filename=/usr/bin/tty pid=385 old_pid=385
<...>-389 [006] ..... 192.020147: new_exec: filename=/usr/bin/dmesg pid=389 comm=bash
bash-389 [006] ..... 192.021377: sched_process_exec: filename=/usr/bin/dmesg pid=389 old_pid=389
Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
---
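Not part of the patch: a rough userspace sketch of the "profiling flush
time" use case from the changelog, assuming trace lines shaped like the
example output above. The helper name flush_times() and the regular
expression are hypothetical. Pairing is keyed on old_pid, on the
assumption that it carries the pid the task had when "new_exec" fired;
filenames containing whitespace are not handled.

```python
import re

# Assumed line shape (taken from the example output in the changelog):
#   <...>-379 [003] ..... 179.626921: new_exec: filename=/usr/bin/sshd pid=379 comm=sshd
LINE_RE = re.compile(
    r"\s(?P<sec>\d+)\.(?P<usec>\d{6}): "
    r"(?P<event>new_exec|sched_process_exec): "
    r"filename=(?P<filename>\S+) pid=(?P<pid>\d+)"
    r"(?: old_pid=(?P<old_pid>\d+))?"
)


def flush_times(lines):
    """Pair each new_exec with the following sched_process_exec.

    Returns a list of (pid, filename, microseconds) tuples, in the order
    the sched_process_exec events appear.
    """
    pending = {}  # pid at new_exec time -> timestamp in microseconds
    results = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # unrelated event or unexpected format
        ts_us = int(m["sec"]) * 1_000_000 + int(m["usec"])
        if m["event"] == "new_exec":
            pending[int(m["pid"])] = ts_us
            continue
        # sched_process_exec: match on old_pid when present (assumption).
        key = int(m["old_pid"] or m["pid"])
        start = pending.pop(key, None)
        if start is not None:
            results.append((key, m["filename"], ts_us - start))
    return results
```

Fed the eight example lines from the changelog, this would report
2210, 4542, 1208 and 1230 microseconds for pids 379, 381, 385 and 389.
A "new_exec" whose exec later crashes simply stays unmatched.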
fs/exec.c | 2 ++
include/trace/events/task.h | 30 ++++++++++++++++++++++++++++++
2 files changed, 32 insertions(+)
diff --git a/fs/exec.c b/fs/exec.c
index 38bf71cbdf5e..ab778ae1fc06 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1268,6 +1268,8 @@ int begin_new_exec(struct linux_binprm * bprm)
if (retval)
return retval;
+ trace_new_exec(current, bprm);
+
/*
* Ensure all future errors are fatal.
*/
diff --git a/include/trace/events/task.h b/include/trace/events/task.h
index 47b527464d1a..8853dc44783d 100644
--- a/include/trace/events/task.h
+++ b/include/trace/events/task.h
@@ -56,6 +56,36 @@ TRACE_EVENT(task_rename,
__entry->newcomm, __entry->oom_score_adj)
);
+/**
+ * new_exec - called before setting up new exec
+ * @task: pointer to the current task
+ * @bprm: pointer to linux_binprm used for new exec
+ *
+ * Called before flushing the old exec, but at the point of no return during
+ * switching to the new exec.
+ */
+TRACE_EVENT(new_exec,
+
+ TP_PROTO(struct task_struct *task, struct linux_binprm *bprm),
+
+ TP_ARGS(task, bprm),
+
+ TP_STRUCT__entry(
+ __string( filename, bprm->filename )
+ __field( pid_t, pid )
+ __string( comm, task->comm )
+ ),
+
+ TP_fast_assign(
+ __assign_str(filename, bprm->filename);
+ __entry->pid = task->pid;
+ __assign_str(comm, task->comm);
+ ),
+
+ TP_printk("filename=%s pid=%d comm=%s",
+ __get_str(filename), __entry->pid, __get_str(comm))
+);
+
#endif
/* This part must be outside protection */
--
2.44.0.478.gd926399ef9-goog