Re: [RFC PATCH bpf-next] ksnoop: kernel argument/return value tracing/display using BTF

From: Alexei Starovoitov
Date: Mon Jan 04 2021 - 20:58:09 EST


On Mon, Jan 04, 2021 at 03:26:31PM +0000, Alan Maguire wrote:
>
> ksnoop can be used to show function signatures; for example:
>
> $ ksnoop info ip_send_skb
> int ip_send_skb(struct net * net, struct sk_buff * skb);
>
> Then we can trace the function, for example:
>
> $ ksnoop trace ip_send_skb

Thanks for sharing. It will be a useful tool.

> +
> + data = get_arg(ctx, currtrace->base_arg);
> +
> + dataptr = (void *)data;
> +
> + if (currtrace->offset)
> + dataptr += currtrace->offset;
> +
> + /* look up member value and read into data field, provided
> + * it <= size of a __u64; when it is, it can be used in
> + * predicate evaluation.
> + */
> + if (currtrace->flags & KSNOOP_F_MEMBER) {
> + ret = -EINVAL;
> + data = 0;
> + if (currtrace->size <= sizeof(__u64))
> + ret = bpf_probe_read_kernel(&data,
> + currtrace->size,
> + dataptr);
> + else
> + bpf_printk("size was %d cant trace",
> + currtrace->size);
> + if (ret) {
> + currdata->err_type_id =
> + currtrace->type_id;
> + currdata->err = ret;
> + continue;
> + }
> + if (currtrace->flags & KSNOOP_F_PTR)
> + dataptr = (void *)data;
> + }
> +
> + /* simple predicate evaluation: if any predicate fails,
> + * skip all tracing for this function.
> + */
> + if (currtrace->flags & KSNOOP_F_PREDICATE_MASK) {
> + bool ok = false;
> +
> + if (currtrace->flags & KSNOOP_F_PREDICATE_EQ &&
> + data == currtrace->predicate_value)
> + ok = true;
> +
> + if (currtrace->flags & KSNOOP_F_PREDICATE_NOTEQ &&
> + data != currtrace->predicate_value)
> + ok = true;
> +
> + if (currtrace->flags & KSNOOP_F_PREDICATE_GT &&
> + data > currtrace->predicate_value)
> + ok = true;
> + if (currtrace->flags & KSNOOP_F_PREDICATE_LT &&
> + data < currtrace->predicate_value)
> + ok = true;
> +
> + if (!ok)
> + goto skiptrace;
> + }
> +
> + currdata->raw_value = data;
> +
> + if (currtrace->flags & (KSNOOP_F_PTR | KSNOOP_F_MEMBER))
> + btf_ptr.ptr = dataptr;
> + else
> + btf_ptr.ptr = &data;
> +
> + btf_ptr.type_id = currtrace->type_id;
> +
> + if (trace->buf_len + MAX_TRACE_DATA >= MAX_TRACE_BUF)
> + break;
> +
> + buf_offset = &trace->buf[trace->buf_len];
> + if (buf_offset > &trace->buf[MAX_TRACE_BUF]) {
> + currdata->err_type_id = currtrace->type_id;
> + currdata->err = -ENOSPC;
> + continue;
> + }
> + currdata->buf_offset = trace->buf_len;
> +
> + ret = bpf_snprintf_btf(buf_offset,
> + MAX_TRACE_DATA,
> + &btf_ptr, sizeof(btf_ptr),
> + BTF_F_PTR_RAW);

The overhead would be much lower if, instead of printing in the kernel, the
tool's bpf prog dumped the struct data into a ring buffer and let the user
space part of the tool do the pretty printing. There would be no need to pass
btf_id from user space to the kernel either. The user space side would need to
gain pretty printing logic, but maybe we can share the code somehow between
the kernel and libbpf.
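
To make the split concrete, here is a rough sketch in plain user-space C, not
code from the patch. Everything named in it is hypothetical: struct raw_rec,
MAX_RAW, capture_raw() and struct member_desc are made up for illustration.
capture_raw() only stands in for what the bpf prog would do (copy raw bytes
into a ring buffer record), and the member table stands in for what user space
would look up in BTF.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_RAW 64	/* hypothetical cap on bytes captured per argument */

/* Hypothetical ring buffer record: raw bytes only, no formatting, no type id. */
struct raw_rec {
	uint32_t arg_idx;
	uint32_t len;
	uint8_t data[MAX_RAW];
};

/* Stand-in for the bpf side: copy the raw bytes and nothing else. */
static int capture_raw(struct raw_rec *rec, uint32_t arg_idx,
		       const void *src, uint32_t len)
{
	if (len > MAX_RAW)
		return -1;
	rec->arg_idx = arg_idx;
	rec->len = len;
	memcpy(rec->data, src, len);
	return 0;
}

/* Hypothetical member layout; a real tool would derive this from BTF. */
struct member_desc {
	const char *name;
	uint32_t offset;
	uint32_t size;
};

/* User-space side: format "name=0xvalue" pairs from the raw bytes.
 * Returns the number of characters written, or -1 on error.
 */
static int format_rec(const struct raw_rec *rec, const struct member_desc *m,
		      size_t nr, char *out, size_t outlen)
{
	size_t used = 0;

	if (outlen == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < nr; i++) {
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		uint64_t val;
		int n;

		/* the member must lie inside the captured bytes */
		if (m[i].offset > rec->len ||
		    m[i].size > rec->len - m[i].offset)
			return -1;

		const uint8_t *p = rec->data + m[i].offset;

		switch (m[i].size) {
		case 1: memcpy(&v8, p, 1); val = v8; break;
		case 2: memcpy(&v16, p, 2); val = v16; break;
		case 4: memcpy(&v32, p, 4); val = v32; break;
		case 8: memcpy(&val, p, 8); break;
		default: return -1;	/* only scalar members in this sketch */
		}

		n = snprintf(out + used, outlen - used, "%s%s=0x%llx",
			     i ? " " : "", m[i].name,
			     (unsigned long long)val);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

The point of the layout is that the record carries no type information at all:
the kernel side only copies bytes, and all knowledge of member names and
offsets stays in user space.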

Separately, the interpreter in the bpf prog to handle predicates is kinda
anti-bpf :) I think ksnoop can generate bpf code on the fly instead. No need
for llvm. The currtrace->offset/size would be written into the prog's
placeholder instructions by ksnoop before loading the prog. That would give
much lower overhead for filtering.
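
A minimal sketch of the placeholder patching, assuming a convention that is
not in the patch: the prog template loads the offset and size with 64-bit
"mov immediate" instructions carrying made-up magic values (PH_OFFSET and
PH_SIZE below), and the tool rewrites those immediates before load. struct
insn mirrors the layout of struct bpf_insn from <linux/bpf.h> and is
redeclared only so the sketch builds without kernel headers; patch_imm() is a
hypothetical helper, not a libbpf call.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as struct bpf_insn in <linux/bpf.h>. */
struct insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

#define OP_MOV64_IMM	0xb7	/* BPF_ALU64 | BPF_MOV | BPF_K */
#define OP_EXIT		0x95	/* BPF_JMP | BPF_EXIT */

/* Hypothetical magic immediates marking the instructions to patch. */
#define PH_OFFSET	0x7fff0001
#define PH_SIZE		0x7fff0002

/* Replace the immediate of every "mov64 imm" whose immediate equals magic.
 * Returns the number of instructions patched, so the caller can refuse to
 * load the prog when a placeholder was not found.
 */
static int patch_imm(struct insn *insns, size_t cnt, int32_t magic,
		     int32_t value)
{
	int patched = 0;

	for (size_t i = 0; i < cnt; i++) {
		if (insns[i].code != OP_MOV64_IMM || insns[i].imm != magic)
			continue;
		insns[i].imm = value;
		patched++;
	}
	return patched;
}
```

With something like this, ksnoop would call patch_imm() once for the offset
and once for the size per traced argument, then hand the patched array to the
loader. The predicate comparison could be specialized the same way, so the
prog that actually runs contains no flag checks at all.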