Re: [PATCH bpf-next v2] bpf: Add bpf_check_attach_target_with_klog method to output failure logs to kernel

From: Yonghong Song
Date: Thu Jul 25 2024 - 01:55:21 EST



On 7/24/24 10:15 PM, Zheao Li wrote:
This is a v2 patch, previous Link: https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u

Compare with v1:

1. Format the code style and signed-off field
2. Use a shorter name bpf_check_attach_target_with_klog instead of
original name bpf_check_attach_target_with_kernel_log

When attaching a freplace hook, failures can occur,
but currently, no output is provided to help developers diagnose the root cause.

This commit adds a new method, bpf_check_attach_target_with_klog,
which outputs the verifier log to the kernel.
Developers can then use dmesg to obtain more detailed information about the failure.

For an example of eBPF code,
Link: https://github.com/Asphaltt/learn-by-example/blob/main/ebpf/freplace/main.go

Co-developed-by: Leon Hwang <hffilwlqm@xxxxxxxxx>
Signed-off-by: Leon Hwang <hffilwlqm@xxxxxxxxx>
Signed-off-by: Zheao Li <me@xxxxxxxxxxxx>
---
include/linux/bpf_verifier.h | 5 +++++
kernel/bpf/syscall.c | 5 +++--
kernel/bpf/trampoline.c | 6 +++---
kernel/bpf/verifier.c | 19 +++++++++++++++++++
4 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 5cea15c81b8a..8eddba62c194 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -848,6 +848,11 @@ static inline void bpf_trampoline_unpack_key(u64 key, u32 *obj_id, u32 *btf_id)
*btf_id = key & 0x7FFFFFFF;
}
+int bpf_check_attach_target_with_klog(const struct bpf_prog *prog,
+ const struct bpf_prog *tgt_prog,
+ u32 btf_id,
+ struct bpf_attach_target_info *tgt_info);

format issue in the above. Same code alignment is needed for arguments in different lines.

+
int bpf_check_attach_target(struct bpf_verifier_log *log,
const struct bpf_prog *prog,
const struct bpf_prog *tgt_prog,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 869265852d51..bf826fcc8cf4 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3464,8 +3464,9 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
*/
struct bpf_attach_target_info tgt_info = {};
- err = bpf_check_attach_target(NULL, prog, tgt_prog, btf_id,
- &tgt_info);
+ err = bpf_check_attach_target_with_klog(prog, NULL,
+ prog->aux->attach_btf_id,
+ &tgt_info);

code alignment issue here as well.
Also, the argument should be 'prog, tgt_prog, btf_id, &tgt_info', right?

if (err)
goto out_unlock;
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index f8302a5ca400..8862adaa7302 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -699,9 +699,9 @@ int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
u64 key;
int err;
- err = bpf_check_attach_target(NULL, prog, NULL,
- prog->aux->attach_btf_id,
- &tgt_info);
+ err = bpf_check_attach_target_with_klog(prog, NULL,
+ prog->aux->attach_btf_id,
+ &tgt_info);

code alignment issue here

if (err)
return err;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1f5302fb0957..4873b72f5a9a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21643,6 +21643,25 @@ static int check_non_sleepable_error_inject(u32 btf_id)
return btf_id_set_contains(&btf_non_sleepable_error_inject, btf_id);
}
+int bpf_check_attach_target_with_klog(const struct bpf_prog *prog,
+ const struct bpf_prog *tgt_prog,
+ u32 btf_id,
+ struct bpf_attach_target_info *tgt_info);

code alignment issue here.

+{
+ struct bpf_verifier_log *log;
+ int err;
+
+ log = kzalloc(sizeof(*log), GFP_KERNEL | __GFP_NOWARN);

__GFP_NOWARN is unnecessary here.

+ if (!log) {
+ err = -ENOMEM;
+ return err;
+ }
+ log->level = BPF_LOG_KERNEL;
+ err = bpf_check_attach_target(log, prog, tgt_prog, btf_id, tgt_info);
+ kfree(log);
+ return err;
+}
+
int bpf_check_attach_target(struct bpf_verifier_log *log,
const struct bpf_prog *prog,
const struct bpf_prog *tgt_prog,

More importantly, Andrii has implemented retsnoop, which intends to locate
precise location in the kernel where err happens. The link is
https://github.com/anakryiko/retsnoop

Maybe you want to take a look and see whether it can resolve your issue.
We should really avoid putting more stuff in dmesg whenever possible.