Re: [PATCH v1 7/8] tracing/perf: Add might_fault check to syscall probes

From: Steven Rostedt
Date: Thu Oct 03 2024 - 18:36:55 EST


On Thu, 3 Oct 2024 11:16:37 -0400
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

> Add a might_fault() check to validate that the perf sys_enter/sys_exit
> probe callbacks are indeed called from a context where page faults can
> be handled.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: Michael Jeanson <mjeanson@xxxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: Yonghong Song <yhs@xxxxxx>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Cc: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> Cc: bpf@xxxxxxxxxxxxxxx
> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> ---
>  include/trace/perf.h          | 1 +
>  kernel/trace/trace_syscalls.c | 2 ++
>  2 files changed, 3 insertions(+)
>
> diff --git a/include/trace/perf.h b/include/trace/perf.h
> index 5650c1bad088..321bfd7919f6 100644
> --- a/include/trace/perf.h
> +++ b/include/trace/perf.h
> @@ -84,6 +84,7 @@ perf_trace_##call(void *__data, proto) \
>  	u64 __count __attribute__((unused)); \
>  	struct task_struct *__task __attribute__((unused)); \
>  	\
> +	might_fault(); \
>  	guard(preempt_notrace)(); \
>  	do_perf_trace_##call(__data, args); \

Same comment applies here: this macro is used for every tracepoint that
perf hooks into, not just the syscall tracepoints.

-- Steve

>  }
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 89d7e4c57b5b..0d42d6f293d6 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -602,6 +602,7 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
>  	 * Syscall probe called with preemption enabled, but the ring
>  	 * buffer and per-cpu data require preemption to be disabled.
>  	 */
> +	might_fault();
>  	guard(preempt_notrace)();
>
>  	syscall_nr = trace_get_syscall_nr(current, regs);
> @@ -710,6 +711,7 @@ static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret)
>  	 * Syscall probe called with preemption enabled, but the ring
>  	 * buffer and per-cpu data require preemption to be disabled.
>  	 */
> +	might_fault();
>  	guard(preempt_notrace)();
>
>  	syscall_nr = trace_get_syscall_nr(current, regs);