Re: [PATCH bpf-next v2] bpf: Add bpf_check_attach_target_with_klog method to output failure logs to kernel
From: Andrii Nakryiko
Date: Thu Jul 25 2024 - 17:27:51 EST
On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>
>
>
> On 25/7/24 14:09, Yonghong Song wrote:
> >
> > On 7/24/24 11:05 PM, Leon Hwang wrote:
> >>
> >> On 25/7/24 13:54, Yonghong Song wrote:
> >>> On 7/24/24 10:15 PM, Zheao Li wrote:
> >>>> This is a v2 patch, previous Link:
> >>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u
> >>>>
> >>>> Compare with v1:
> >>>>
> >>>> 1. Format the code style and signed-off field
> >>>> 2. Use a shorter name bpf_check_attach_target_with_klog instead of
> >>>> original name bpf_check_attach_target_with_kernel_log
> >>>>
> >>>> When attaching a freplace hook, failures can occur,
> >>>> but currently, no output is provided to help developers diagnose the
> >>>> root cause.
> >>>>
> >>>> This commit adds a new method, bpf_check_attach_target_with_klog,
> >>>> which outputs the verifier log to the kernel.
> >>>> Developers can then use dmesg to obtain more detailed information
> >>>> about the failure.
> >>>>
> >>>> For an example of eBPF code,
> >>>> Link:
> >>>> https://github.com/Asphaltt/learn-by-example/blob/main/ebpf/freplace/main.go
> >>>>
> >>>> Co-developed-by: Leon Hwang <hffilwlqm@xxxxxxxxx>
> >>>> Signed-off-by: Leon Hwang <hffilwlqm@xxxxxxxxx>
> >>>> Signed-off-by: Zheao Li <me@xxxxxxxxxxxx>
> >>>> ---
> >>>> include/linux/bpf_verifier.h | 5 +++++
> >>>> kernel/bpf/syscall.c | 5 +++--
> >>>> kernel/bpf/trampoline.c | 6 +++---
> >>>> kernel/bpf/verifier.c | 19 +++++++++++++++++++
> >>>> 4 files changed, 30 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/bpf_verifier.h
> >>>> b/include/linux/bpf_verifier.h
> >>>> index 5cea15c81b8a..8eddba62c194 100644
> >>>> --- a/include/linux/bpf_verifier.h
> >>>> +++ b/include/linux/bpf_verifier.h
> >>>> @@ -848,6 +848,11 @@ static inline void bpf_trampoline_unpack_key(u64
> >>>> key, u32 *obj_id, u32 *btf_id)
> >>>> *btf_id = key & 0x7FFFFFFF;
> >>>> }
> >>>> +int bpf_check_attach_target_with_klog(const struct bpf_prog *prog,
> >>>> + const struct bpf_prog *tgt_prog,
> >>>> + u32 btf_id,
> >>>> + struct bpf_attach_target_info *tgt_info);
> >>> format issue in the above. Same code alignment is needed for arguments
> >>> in different lines.
> >>>
> >>>> +
> >>>> int bpf_check_attach_target(struct bpf_verifier_log *log,
> >>>> const struct bpf_prog *prog,
> >>>> const struct bpf_prog *tgt_prog,
> >>>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> >>>> index 869265852d51..bf826fcc8cf4 100644
> >>>> --- a/kernel/bpf/syscall.c
> >>>> +++ b/kernel/bpf/syscall.c
> >>>> @@ -3464,8 +3464,9 @@ static int bpf_tracing_prog_attach(struct
> >>>> bpf_prog *prog,
> >>>> */
> >>>> struct bpf_attach_target_info tgt_info = {};
> >>>> - err = bpf_check_attach_target(NULL, prog, tgt_prog, btf_id,
> >>>> - &tgt_info);
> >>>> + err = bpf_check_attach_target_with_klog(prog, NULL,
> >>>> + prog->aux->attach_btf_id,
> >>>> + &tgt_info);
> >>> code alignment issue here as well.
> >>> Also, the argument should be 'prog, tgt_prog, btf_id, &tgt_info', right?
> >>>
> >>>> if (err)
> >>>> goto out_unlock;
> >>>> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> >>>> index f8302a5ca400..8862adaa7302 100644
> >>>> --- a/kernel/bpf/trampoline.c
> >>>> +++ b/kernel/bpf/trampoline.c
> >>>> @@ -699,9 +699,9 @@ int bpf_trampoline_link_cgroup_shim(struct
> >>>> bpf_prog *prog,
> >>>> u64 key;
> >>>> int err;
> >>>> - err = bpf_check_attach_target(NULL, prog, NULL,
> >>>> - prog->aux->attach_btf_id,
> >>>> - &tgt_info);
> >>>> + err = bpf_check_attach_target_with_klog(prog, NULL,
> >>>> + prog->aux->attach_btf_id,
> >>>> + &tgt_info);
> >>> code alignment issue here
> >>>
> >>>> if (err)
> >>>> return err;
> >>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >>>> index 1f5302fb0957..4873b72f5a9a 100644
> >>>> --- a/kernel/bpf/verifier.c
> >>>> +++ b/kernel/bpf/verifier.c
> >>>> @@ -21643,6 +21643,25 @@ static int
> >>>> check_non_sleepable_error_inject(u32 btf_id)
> >>>> return btf_id_set_contains(&btf_non_sleepable_error_inject,
> >>>> btf_id);
> >>>> }
> >>>> +int bpf_check_attach_target_with_klog(const struct bpf_prog *prog,
> >>>> + const struct bpf_prog *tgt_prog,
> >>>> + u32 btf_id,
> >>>> + struct bpf_attach_target_info *tgt_info);
> >>> code alignment issue here.
> >>>
> >>>> +{
> >>>> + struct bpf_verifier_log *log;
> >>>> + int err;
> >>>> +
> >>>> + log = kzalloc(sizeof(*log), GFP_KERNEL | __GFP_NOWARN);
> >>> __GFP_NOWARN is unnecessary here.
> >>>
> >>>> + if (!log) {
> >>>> + err = -ENOMEM;
> >>>> + return err;
> >>>> + }
> >>>> + log->level = BPF_LOG_KERNEL;
> >>>> + err = bpf_check_attach_target(log, prog, tgt_prog, btf_id,
> >>>> tgt_info);
> >>>> + kfree(log);
> >>>> + return err;
> >>>> +}
> >>>> +
> >>>> int bpf_check_attach_target(struct bpf_verifier_log *log,
> >>>> const struct bpf_prog *prog,
> >>>> const struct bpf_prog *tgt_prog,
> >>> More importantly, Andrii has implemented retsnoop, which intends to
> >>> locate
> >>> precise location in the kernel where err happens. The link is
> >>> https://github.com/anakryiko/retsnoop
> >>>
> >>> Maybe you want to take a look and see whether it can resolve your issue.
> >>> We should really avoid putting more stuff in dmesg whenever possible.
> >>>
> >> retsnoop is really cool.
> >>
> >> However, when something wrong in bpf_check_attach_target(), retsnoop
> >> only gets its return value -EINVAL, without any bpf_log() in it. It's
> >> hard to figure out the reason why bpf_check_attach_target() returns
> >> -EINVAL.
> >
> > It should have line number like below in
> > https://github.com/anakryiko/retsnoop
> >
> > |$ sudo ./retsnoop -e '*sys_bpf' -a ':kernel/bpf/*.c' Receiving data...
> > 20:19:36.372607 -> 20:19:36.372682 TID/PID 8346/8346 (simfail/simfail):
> > entry_SYSCALL_64_after_hwframe+0x63 (arch/x86/entry/entry_64.S:120:0)
> > do_syscall_64+0x35 (arch/x86/entry/common.c:80:7) . do_syscall_x64
> > (arch/x86/entry/common.c:50:12) 73us [-ENOMEM] __x64_sys_bpf+0x1a
> > (kernel/bpf/syscall.c:5067:1) 70us [-ENOMEM] __sys_bpf+0x38b
> > (kernel/bpf/syscall.c:4947:9) . map_create (kernel/bpf/syscall.c:1106:8)
> > . find_and_alloc_map (kernel/bpf/syscall.c:132:5) ! 50us [-ENOMEM]
> > array_map_alloc !* 2us [NULL] bpf_map_alloc_percpu Could you double
> > check? It does need corresponding kernel source though. |
> >
>
> I have a try on an Ubuntu 24.04 VM, whose kernel is 6.8.0-39-generic.
>
> $ sudo retsnoop -e '*sys_bpf' -a ':kernel/bpf/*.c' -T
> Receiving data...
> 07:18:38.643643 -> 07:18:38.643728 TID/PID 6042/6039 (freplace/freplace):
>
> FUNCTION CALL TRACE RESULT
> DURATION
> ---------------------------------------------------
> -------------------- --------
> → __x64_sys_bpf
>
> → __sys_bpf
>
> ↔ bpf_check_uarg_tail_zero [0]
> 2.376us
> → link_create
>
> ↔ __bpf_prog_get
> [0xffffb55f40db3000] 2.796us
> ↔ bpf_prog_attach_check_attach_type [0]
> 2.260us
> → bpf_tracing_prog_attach
>
> ↔ __bpf_prog_get
> [0xffffb55f40d71000] 9.455us
> → bpf_check_attach_target
>
> → btf_check_type_match
>
> → btf_check_func_type_match
>
> ↔ bpf_log [void]
> 2.578us
> ← btf_check_func_type_match [-EINVAL]
> 7.659us
> ← btf_check_type_match [-EINVAL]
> 15.950us
> ← bpf_check_attach_target [-EINVAL]
> 22.397us
> ↔ __bpf_prog_put [void]
> 2.323us
> ← bpf_tracing_prog_attach [-EINVAL]
> 45.509us
> ↔ __bpf_prog_put [void]
> 2.182us
> ← link_create [-EINVAL]
> 66.445us
> ← __sys_bpf [-EINVAL]
> 77.347us
> ← __x64_sys_bpf [-EINVAL]
> 81.979us
>
> entry_SYSCALL_64_after_hwframe+0x78
> (arch/x86/entry/entry_64.S:130:0)
> do_syscall_64+0x7f
> (arch/x86/entry/common.c:83:7)
> . do_syscall_x64
> (arch/x86/entry/common.c:52:12)
> x64_sys_call+0x1936
> (arch/x86/entry/syscall_64.c:33:1)
> 81us [-EINVAL] __x64_sys_bpf+0x1a
> (kernel/bpf/syscall.c:5588:1)
> 77us [-EINVAL] __sys_bpf+0x4ae
> (kernel/bpf/syscall.c:5556:9)
> ! 66us [-EINVAL] link_create
>
> !* 45us [-EINVAL] bpf_tracing_prog_attach
>
> !* 22us [-EINVAL] bpf_check_attach_target
>
> !* 15us [-EINVAL] btf_check_type_match
>
> !* 7us [-EINVAL] btf_check_func_type_match
>
> P.S. Check
> https://gist.github.com/Asphaltt/883fd7362968f7747e820d63a9519971 to
> have a better view of this output.
>
> When attach freplace prog to a static-noline subprog, there is a
> bpf_log() in btf_check_func_type_match(). However, I don't know what
> bpf_log() logs.
If you build the very latest retsnoop from Github (this functionality
hasn't been released just yet, I added it literally two days ago), you
will be able to capture bpf_log's format string. vararg arguments
themselves are not (yet) captured, but I'm going to play with that.
Try something like this:
sudo ./retsnoop -e verbose -e bpf_log -e bpf_verifier_vlog -e
bpf_verifier_log_write -STA -v
And you might see something like:
FUNCTION CALLS RESULT DURATION ARGS
-------------- ------ -------- ----
↔ bpf_log [void] 2.555us log=&{} fmt='func '%s' arg%d has
btf_id %d type %s '%s' ' =(vararg)
or
FUNCTION CALLS RESULT DURATION ARGS
-------------- ------ -------- ----
↔ bpf_log [void] 5.729us log=&{} fmt='arg#%d reference
type('%s %s') size cannot be determined: %ld ' =(vararg)
So you'll get a general understanding from format string (but yeah,
actual arguments would be good to have).
This is not really a solution, but definitely useful for debugging.
Is there some simple way for me to reproduce your specific issue, I'd
like to use that as one motivating example to see how far retsnoop can
be pushed.
P.S. I do think that putting any logging like this into dmesg is
definitely wrong, btw.
>
> With this patch, we are able to figure out what bpf_log() logs.
> Therefore, we are able to figure out the reason why it fails to attach
> freplace prog.
>
> Thanks,
> Leon
>