Re: WARNING in __mark_chain_precision
From: Hao Sun
Date: Mon Jan 02 2023 - 04:42:44 EST
Yonghong Song <yhs@xxxxxxxx> 于2023年1月2日周一 03:20写道:
>
>
>
> On 12/30/22 1:44 AM, Hao Sun wrote:
>>
>>
>> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> 于2022年12月30日周五 06:16写道:
>>>
>>> On Tue, Dec 27, 2022 at 9:24 PM Yonghong Song <yhs@xxxxxxxx> wrote:
>>>>
>>>>
>>>>
>>>> On 12/20/22 4:30 PM, Andrii Nakryiko wrote:
>>>>> On Mon, Dec 19, 2022 at 11:13 AM <sdf@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> On 12/19, Hao Sun wrote:
>>>>>>> Hi,
>>>>>>
>>>>>>> The following backtracking bug can be triggered on the latest bpf-next and
>>>>>>> Linux 6.1 with the C prog provided. I don't have enough knowledge about
>>>>>>> this part in the verifier, don't know how to fix this.
>>>>>>
>>>>>> Maybe something related to commit be2ef8161572 ("bpf: allow precision
>>>>>> tracking
>>>>>> for programs with subprogs") and/or the related ones?
>>>>>>
>>>>>>
>>>>>>> This can be reproduced on:
>>>>>>
>>>>>>> HEAD commit: 0e43662e61f2 tools/resolve_btfids: Use pkg-config to locate
>>>>>>> libelf
>>>>>>> git tree: bpf-next
>>>>>>> console log: https://pastebin.com/raw/45hZ7iqm
>>>>>>> kernel config: https://pastebin.com/raw/0pu1CHRm
>>>>>>> C reproducer: https://pastebin.com/raw/tqsiezvT
>>>>>>
>>>>>>> func#0 @0
>>>>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>>>>> 0: (18) r2 = 0x8000000000000 ; R2_w=2251799813685248
>>>>>>> 2: (18) r6 = 0xffff888027358000 ;
>>>>>>> R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
>>>>>>> 4: (18) r7 = 0xffff88802735a000 ;
>>>>>>> R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
>>>>>>> 6: (18) r8 = 0xffff88802735e000 ;
>>>>>>> R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
>>>>>>> 8: (18) r9 = 0x8e9700000000 ; R9_w=156779191205888
>>>>>>> 10: (36) if w9 >= 0xffffffe3 goto pc+1
>>>>>>> last_idx 10 first_idx 0
>>>>>>> regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
>>>>>>> 11: R9_w=156779191205888
>>>>>>> 11: (85) call #0
>>>>>>> 12: (cc) w2 s>>= w7
>>>>>
>>>>> w2 should have been set to NOT_INIT (because r1-r5 are clobbered by
>>>>> calls) and rejected here as !read_ok (see check_reg_arg()) before
>>>>> attempting to mark precision for r2. Can you please try to debug and
>>>>> understand why that didn't happen here?
>>>>
>>>> The verifier is doing the right thing here and the 'call #0' does
>>>> implicitly cleared r1-r5.
>>>>
>>>> So for 'w2 s>>= w7', since w2 is used, the verifier tries to find
>>>> its definition by backtracing. It encountered 'call #0', which clears
>>>
>>> and that's what I'm saying is incorrect. Normally we'd get !read_ok
>>> error because s>>= is both READ and WRITE on w2, which is
>>> uninitialized after call instruction according to BPF ABI. And that's
>>> what actually seems to happen correctly in my (simpler) tests locally.
>>> But something is special about this specific repro that somehow either
>>> bypasses this logic, or attempts to mark precision before we get to
>>> that test. That's what we should investigate. I haven't tried to run
>>> this specific repro locally yet, so can't tell for sure.
>>>
>>
>> So, the reason why w2 is not marked as uninit is that the kfunc call in
>> the BPF program is invalid, "call #0", imm is zero, right?
>
> Yes, "call #0" is invalid. As the code below
>
>> /* skip for now, but return error when we find this in
> fixup_kfunc_call */
>> if (!insn->imm)
>> return 0;
>
> The error report will be delayed later in fixup_kfunc_call().
>
> static int fixup_kfunc_call(struct bpf_verifier_env *env, struct
> bpf_insn *insn,
> struct bpf_insn *insn_buf, int insn_idx,
> int *cnt)
> {
> const struct bpf_kfunc_desc *desc;
>
> if (!insn->imm) {
> verbose(env, "invalid kernel function call not
> eliminated in verifier pass\n");
> return -EINVAL;
> }
>
>
>> In check_kfunc_call(), it skips this error temporarily:
>>
>> /* skip for now, but return error when we find this in fixup_kfunc_call */
>> if (!insn->imm)
>> return 0;
>>
>> So the kfunc call is the previous instruction before "w2 s>>= w7", this
>> leads to the warning in backtrack_insn():
>>
>> /* regular helper call sets R0 */
>> *reg_mask &= ~1;
>> if (*reg_mask & 0x3f) {
>> /* if backtracing was looking for registers R1-R5
>> * they should have been found already.
>> */
>> verbose(env, "BUG regs %x\n", *reg_mask);
>> WARN_ONCE(1, "verifier backtracking bug”);
>> return -EFAULT;
>> }
>
> The main triggering the backtrack_insn() is due to
>
> } else {
> /* scalar += pointer
> * This is legal, but we have to
> reverse our
> * src/dest handling in computing the range
> */
> err = mark_chain_precision(env,
> insn->dst_reg);
> if (err)
> return err;
> return adjust_ptr_min_max_vals(env, insn,
> src_reg,
> dst_reg);
> }
>
>
> unc#0 @0
> 0: R1=ctx(off=0,imm=0) R10=fp0
> 0: (18) r2 = 0x8000000000000 ; R2_w=2251799813685248
> 2: (18) r6 = 0xffff888100d29000 ;
> R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
> 4: (18) r7 = 0xffff888100d2a000 ;
> R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
> 6: (18) r8 = 0xffff888100d2ac00 ;
> R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
> 8: (18) r9 = 0x8e9700000000 ; R9_w=156779191205888
> 10: (36) if w9 >= 0xffffffe3 goto pc+1
> last_idx 10 first_idx 0
> regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
> 11: R9_w=156779191205888
> 11: (85) call #0
> 12: (cc) w2 s>>= w7
> last_idx 12 first_idx 12
> parent didn't have regs=4 stack=0 marks: R1=ctx(off=0,imm=0)
> R2_rw=P2251799813685248 R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
> R7_rw=map_ptr(off=0,ks=156,vs=2624,imm=0) R8_w=map_ptr(off=0,ks=2396,v0
> last_idx 11 first_idx 0
> regs=4 stack=0 before 11: (85) call #0
> BUG regs 4
>
> For insn 12, 'w2 s>>= w7', w2 is a scalar and w7 is a map_ptr. Hence,
> based on the above verifier code, mark_chain_precision() is triggered.
>
> Not sure what is the purpose of this test. But to make it succeed,
> first "call #0" need to change to a valid kfunc call, and second, you
> might want to change 'w2 s>>= w7' to e.g., 'w9 s>>= w7' to avoid
> precision tracking.
>
The purpose is not to make the test "succeed", the verifier temporarily
skips the invalid kfunc insn "call #0", but this insn triggered a warning
in backtrack_insn(), while it is supposed to reject the program either
due to insn#12 32bit ptr alu or insn#11 invalid kfunc.
Maybe something like the bellow, after applying the patch, the reproducer
is rejected:
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (18) r2 = 0x8000000000000 ; R2_w=2251799813685248
2: (18) r6 = 0xffff88817d563000 ; R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
4: (18) r7 = 0xffff888171ee9000 ; R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
6: (18) r8 = 0xffff888171ee8000 ; R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
8: (18) r9 = 0x8e9700000000 ; R9_w=156779191205888
10: (36) if w9 >= 0xffffffe3 goto pc+1
last_idx 10 first_idx 0
regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
11: R9_w=156779191205888
11: (85) call #0
12: (cc) w2 s>>= w7
last_idx 12 first_idx 12
parent didn't have regs=4 stack=0 marks: R1=ctx(off=0,imm=0) R2_rw=P2251799813685248 R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0) R7_rw=map_ptr(off=0,ks=156,vs=2624,imm=0) R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0) R9_w=156779191205888 R10=fp0
last_idx 11 first_idx 0
regs=4 stack=0 before 11: (85) call #0
regs=4 stack=0 before 10: (36) if w9 >= 0xffffffe3 goto pc+1
regs=4 stack=0 before 8: (18) r9 = 0x8e9700000000
regs=4 stack=0 before 6: (18) r8 = 0xffff888171ee8000
regs=4 stack=0 before 4: (18) r7 = 0xffff888171ee9000
regs=4 stack=0 before 2: (18) r6 = 0xffff88817d563000
regs=4 stack=0 before 0: (18) r2 = 0x8000000000000
R2 32-bit pointer arithmetic prohibited
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4a25375ebb0d..abc7e96d826f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2743,6 +2743,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
*reg_mask |= sreg;
} else if (class == BPF_JMP || class == BPF_JMP32) {
if (opcode == BPF_CALL) {
+ /* skip for now, should return error when we find this in fixup_kfunc_call */
+ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && insn->imm == 0)
+ return 0;
if (insn->src_reg == BPF_PSEUDO_CALL)
return -ENOTSUPP;
/* BPF helpers that invoke callback subprogs are
>>
>> Any idea or hint on how to fix this?
>>