Re: [PATCH v2 bpf-next 2/4] bpf: introduce helper bpf_get_task_stack()
From: Andrii Nakryiko
Date: Fri Jun 26 2020 - 16:17:55 EST
On Thu, Jun 25, 2020 at 5:14 PM Song Liu <songliubraving@xxxxxx> wrote:
>
> Introduce helper bpf_get_task_stack(), which dumps the stack trace of a
> given task. This is different from bpf_get_stack(), which gets the stack
> trace of the current task. One potential use case of bpf_get_task_stack()
> is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a
> seq_file.
>
> bpf_get_task_stack() uses stack_trace_save_tsk() instead of
> get_perf_callchain() for the kernel stack. The benefit of this choice is
> that stack_trace_save_tsk() doesn't require changes in arch/. The
> downside is that it dumps the stack trace into an unsigned long array,
> so on 32-bit systems we need to translate it to a u64 array.
>
> Signed-off-by: Song Liu <songliubraving@xxxxxx>
> ---
Looks great. I just think there are cases where the user doesn't
necessarily have a valid task_struct pointer, just a PID, so it would be
nice not to artificially restrict such cases; an extra helper would
cover them.
Acked-by: Andrii Nakryiko <andriin@xxxxxx>
>  include/linux/bpf.h            |  1 +
>  include/uapi/linux/bpf.h       | 35 ++++++++++++++-
>  kernel/bpf/stackmap.c          | 79 ++++++++++++++++++++++++++++++++--
>  kernel/trace/bpf_trace.c       |  2 +
>  scripts/bpf_helpers_doc.py     |  2 +
>  tools/include/uapi/linux/bpf.h | 35 ++++++++++++++-
>  6 files changed, 149 insertions(+), 5 deletions(-)
>
[...]
> +	/* stack_trace_save_tsk() works on unsigned long array, while
> +	 * perf_callchain_entry uses u64 array. For 32-bit systems, it is
> +	 * necessary to fix this mismatch.
> +	 */
> +	if (__BITS_PER_LONG != 64) {
> +		unsigned long *from = (unsigned long *) entry->ip;
> +		u64 *to = entry->ip;
> +		int i;
> +
> +		/* copy data from the end to avoid using extra buffer */
> +		for (i = entry->nr - 1; i >= (int)init_nr; i--)
> +			to[i] = (u64)(from[i]);
Doing this forward would be just fine as well, no? The first iteration
will cast and overwrite the low 32 bits, and the subsequent iterations
won't even overlap.
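
For reference, here is a stand-alone user-space sketch of the in-place
widening the loop above performs, copying from the end as the patch
does (the values and array size are made up):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t entry_ip[4];
	/* simulate a 32-bit kernel: stack_trace_save_tsk() packed
	 * 32-bit "unsigned long" entries at the start of the u64 array
	 */
	uint32_t *from = (uint32_t *)entry_ip;
	int nr = 4, i;

	for (i = 0; i < nr; i++)
		from[i] = 0x1000 + i;

	/* widen in place from the end, so a u64 store never clobbers
	 * a 32-bit entry that hasn't been read yet
	 */
	for (i = nr - 1; i >= 0; i--)
		entry_ip[i] = (uint64_t)from[i];

	for (i = 0; i < nr; i++)
		printf("0x%llx\n", (unsigned long long)entry_ip[i]);
	return 0;
}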
> +	}
> +
> +exit_put:
> +	put_callchain_entry(rctx);
> +
> +	return entry;
> +}
> +
[...]
> +BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
> +	   u32, size, u64, flags)
> +{
> +	struct pt_regs *regs = task_pt_regs(task);
> +
> +	return __bpf_get_stack(regs, task, buf, size, flags);
> +}
So this takes advantage of BTF and having a direct task_struct pointer.
But for kprobes/tracepoints I think it would also be extremely helpful
to be able to request a stack trace by PID. How about one more helper
that wraps this one with get/put task by PID, e.g.,
bpf_get_pid_stack(int pid, void *buf, u32 size, u64 flags)? Would that
be a problem?
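
As a rough sketch of what I have in mind (bpf_get_pid_stack is just the
name suggested above, and the exact lookup/refcount calls are my
assumption, not part of this patch):

BPF_CALL_4(bpf_get_pid_stack, int, pid, void *, buf, u32, size, u64, flags)
{
	struct task_struct *task;
	long ret;

	/* find_get_task_by_vpid() takes a reference on success */
	task = find_get_task_by_vpid(pid);
	if (!task)
		return -ESRCH;

	ret = __bpf_get_stack(task_pt_regs(task), task, buf, size, flags);
	put_task_struct(task);
	return ret;
}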
> +
> +static int bpf_get_task_stack_btf_ids[5];
> +const struct bpf_func_proto bpf_get_task_stack_proto = {
> +	.func		= bpf_get_task_stack,
> +	.gpl_only	= false,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_BTF_ID,
> +	.arg2_type	= ARG_PTR_TO_UNINIT_MEM,
> +	.arg3_type	= ARG_CONST_SIZE_OR_ZERO,
> +	.arg4_type	= ARG_ANYTHING,
> +	.btf_id		= bpf_get_task_stack_btf_ids,
> +};
> +
[...]
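
To make the intended usage from the commit message concrete, here is a
sketch of a task iterator program; it assumes the bpf_iter.h header and
BPF_SEQ_PRINTF macro from the bpf_iter selftests, with UAPI headers
regenerated after this patch:

/* Sketch only: dump a /proc/<pid>/stack-style trace for every task */
#include "bpf_iter.h"
#include <bpf/bpf_helpers.h>

char _license[] SEC("license") = "GPL";

#define MAX_STACK_TRACE_DEPTH 64
unsigned long entries[MAX_STACK_TRACE_DEPTH];

SEC("iter/task")
int dump_task_stack(struct bpf_iter__task *ctx)
{
	struct seq_file *seq = ctx->meta->seq;
	struct task_struct *task = ctx->task;
	long i, retlen;

	if (!task)
		return 0;

	/* retlen is the number of bytes written into entries */
	retlen = bpf_get_task_stack(task, entries, sizeof(entries), 0);
	if (retlen < 0)
		return 0;

	BPF_SEQ_PRINTF(seq, "pid: %8u\n", task->pid);
	for (i = 0; i < MAX_STACK_TRACE_DEPTH; i++) {
		if (retlen > i * sizeof(unsigned long))
			BPF_SEQ_PRINTF(seq, "[<0>] %pB\n", (void *)entries[i]);
	}
	BPF_SEQ_PRINTF(seq, "\n");
	return 0;
}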