Re: WARNING in __mark_chain_precision

From: Hao Sun
Date: Fri Dec 30 2022 - 04:45:04 EST




Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote on Fri, Dec 30, 2022 at 06:16:
>
> On Tue, Dec 27, 2022 at 9:24 PM Yonghong Song <yhs@xxxxxxxx> wrote:
>>
>>
>>
>> On 12/20/22 4:30 PM, Andrii Nakryiko wrote:
>>> On Mon, Dec 19, 2022 at 11:13 AM <sdf@xxxxxxxxxx> wrote:
>>>>
>>>> On 12/19, Hao Sun wrote:
>>>>> Hi,
>>>>
>>>>> The following backtracking bug can be triggered on the latest bpf-next and
>>>>> Linux 6.1 with the C prog provided. I don't have enough knowledge about
>>>>> this part in the verifier, don't know how to fix this.
>>>>
>>>> Maybe something related to commit be2ef8161572 ("bpf: allow precision
>>>> tracking
>>>> for programs with subprogs") and/or the related ones?
>>>>
>>>>
>>>>> This can be reproduced on:
>>>>
>>>>> HEAD commit: 0e43662e61f2 tools/resolve_btfids: Use pkg-config to locate
>>>>> libelf
>>>>> git tree: bpf-next
>>>>> console log: https://pastebin.com/raw/45hZ7iqm
>>>>> kernel config: https://pastebin.com/raw/0pu1CHRm
>>>>> C reproducer: https://pastebin.com/raw/tqsiezvT
>>>>
>>>>> func#0 @0
>>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>>> 0: (18) r2 = 0x8000000000000 ; R2_w=2251799813685248
>>>>> 2: (18) r6 = 0xffff888027358000 ;
>>>>> R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
>>>>> 4: (18) r7 = 0xffff88802735a000 ;
>>>>> R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
>>>>> 6: (18) r8 = 0xffff88802735e000 ;
>>>>> R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
>>>>> 8: (18) r9 = 0x8e9700000000 ; R9_w=156779191205888
>>>>> 10: (36) if w9 >= 0xffffffe3 goto pc+1
>>>>> last_idx 10 first_idx 0
>>>>> regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
>>>>> 11: R9_w=156779191205888
>>>>> 11: (85) call #0
>>>>> 12: (cc) w2 s>>= w7
>>>
>>> w2 should have been set to NOT_INIT (because r1-r5 are clobbered by
>>> calls) and rejected here as !read_ok (see check_reg_arg()) before
>>> attempting to mark precision for r2. Can you please try to debug and
>>> understand why that didn't happen here?
>>
>> The verifier is doing the right thing here and the 'call #0' does
>> implicitly clear r1-r5.
>>
>> So for 'w2 s>>= w7', since w2 is used, the verifier tries to find
>> its definition by backtracing. It encountered 'call #0', which clears
>
> and that's what I'm saying is incorrect. Normally we'd get !read_ok
> error because s>>= is both READ and WRITE on w2, which is
> uninitialized after call instruction according to BPF ABI. And that's
> what actually seems to happen correctly in my (simpler) tests locally.
> But something is special about this specific repro that somehow either
> bypasses this logic, or attempts to mark precision before we get to
> that test. That's what we should investigate. I haven't tried to run
> this specific repro locally yet, so can't tell for sure.
>

So, the reason w2 is not marked as uninit is that the kfunc call in
the BPF program is invalid: it is "call #0", i.e. imm is zero, right?
check_kfunc_call() skips this error temporarily and returns early:

	/* skip for now, but return error when we find this in fixup_kfunc_call */
	if (!insn->imm)
		return 0;

Since this kfunc call is the instruction immediately before "w2 s>>= w7",
backtracking from that instruction reaches the call while still looking
for r2, which leads to the warning in backtrack_insn():

	/* regular helper call sets R0 */
	*reg_mask &= ~1;
	if (*reg_mask & 0x3f) {
		/* if backtracing was looking for registers R1-R5
		 * they should have been found already.
		 */
		verbose(env, "BUG regs %x\n", *reg_mask);
		WARN_ONCE(1, "verifier backtracking bug");
		return -EFAULT;
	}
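
To make the interaction above concrete, here is a small standalone C sketch. It is a toy model, not kernel code: toy_state, toy_check_call() and toy_backtrack_call() are made-up names, and only the "imm == 0" early return and the 0x3f mask check mirror the two snippets quoted above. It assumes, as described above, that the early return happens before r1-r5 are marked uninitialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; these are NOT the verifier's real types. */
enum { NREGS = 11, CALLER_SAVED = 6 };

struct toy_state {
	bool init[NREGS];	/* true if the register holds a readable value */
};

/* Mark every register as initialized (toy starting point). */
static void toy_init(struct toy_state *st)
{
	for (int r = 0; r < NREGS; r++)
		st->init[r] = true;
}

/*
 * Toy model of verifying a call. With imm == 0 the function returns
 * early and leaves the state untouched (assumption: this mirrors the
 * "skip for now" path). Otherwise R0-R5 are clobbered and R0 then
 * receives the return value.
 */
static void toy_check_call(struct toy_state *st, int32_t imm)
{
	if (imm == 0)
		return;
	for (int r = 0; r < CALLER_SAVED; r++)
		st->init[r] = false;
	st->init[0] = true;
}

/*
 * Toy model of one backtracking step over a call instruction.
 * Returns 0 if the step is fine, -1 if the mask still asks for R1-R5,
 * which is the condition that fires the WARN in the real code.
 */
static int toy_backtrack_call(uint32_t *reg_mask)
{
	*reg_mask &= ~1u;		/* the call defines R0 */
	if (*reg_mask & 0x3f)		/* R1-R5 still wanted */
		return -1;
	return 0;
}
```

In this model, a call with a nonzero imm leaves r2 uninitialized, so a later read of w2 would be rejected before precision marking ever starts. With imm == 0, r2 stays readable, "w2 s>>= w7" is accepted, and backtracking over the call with bit 2 set in the mask hits the -1 path.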

Any idea or hint on how to fix this?