Re: [PATCH bpf-next] Detect jumping to reserved code during check_cfg()

From: Hao Sun
Date: Thu Oct 12 2023 - 02:23:18 EST


On Wed, Oct 11, 2023 at 4:50 PM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 10/11/23 8:46 AM, Hao Sun wrote:
> > On Wed, Oct 11, 2023 at 4:42 AM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> >> On Tue, Oct 10, 2023 at 1:33 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
> >>> On 10/10/23 9:02 AM, John Fastabend wrote:
> >>>> Hao Sun wrote:
> >>>>> Currently, we don't check if the branch-taken of a jump is reserved code of
> >>>>> ld_imm64. Instead, such a issue is captured in check_ld_imm(). The verifier
> >>>>> gives the following log in such case:
> >>>>>
> >>>>> func#0 @0
> >>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
> >>>>> 0: (18) r4 = 0xffff888103436000 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0)
> >>>>> 2: (18) r1 = 0x1d ; R1_w=29
> >>>>> 4: (55) if r4 != 0x0 goto pc+4 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0)
> >>>>> 5: (1c) w1 -= w1 ; R1_w=0
> >>>>> 6: (18) r5 = 0x32 ; R5_w=50
> >>>>> 8: (56) if w5 != 0xfffffff4 goto pc-2
> >>>>> mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1
> >>>>> mark_precise: frame0: regs=r5 stack= before 6: (18) r5 = 0x32
> >>>>> 7: R5_w=50
> >>>>> 7: BUG_ld_00
> >>>>> invalid BPF_LD_IMM insn
> >>>>>
> >>>>> Here the verifier rejects the program because it thinks insn at 7 is an
> >>>>> invalid BPF_LD_IMM, but such a error log is not accurate since the issue
> >>>>> is jumping to reserved code not because the program contains invalid insn.
> >>>>> Therefore, make the verifier check the jump target during check_cfg(). For
> >>>>> the same program, the verifier reports the following log:
> >>>>
> >>>> I think we at least would want a test case for this. Also how did you create
> >>>> this case? Is it just something you did manually and noticed a strange error?
> >>>
> >>> Curious as well.
> >>>
> >>> We do have test cases which try to jump into the middle of a double insn as can
> >>> be seen that this patch breaks BPF CI with regards to log mismatch below (which
> >>> still needs to be adapted, too). Either way, it probably doesn't hurt to also add
> >>> the above snippet as a test.
> >>>
> >>> Hao, as I understand, the patch here is an usability improvement (not a fix per se)
> >>> where we reject such cases earlier during cfg check rather than at a later point
> >>> where we validate ld_imm instruction. Or are there cases you found which were not
> >>> yet captured via current check_ld_imm()?
> >>>
> >>> test_verifier failure log :
> >>>
> >>> #458/u test1 ld_imm64 FAIL
> >>> Unexpected verifier log!
> >>> EXP: R1 pointer comparison
> >>> RES:
> >>> FAIL
> >>> Unexpected error message!
> >>> EXP: R1 pointer comparison
> >>> RES: jump to reserved code from insn 0 to 2
> >>> verification time 22 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>>
> >>> jump to reserved code from insn 0 to 2
> >>> verification time 22 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>> #458/p test1 ld_imm64 FAIL
> >>> Unexpected verifier log!
> >>> EXP: invalid BPF_LD_IMM insn
> >>> RES:
> >>> FAIL
> >>> Unexpected error message!
> >>> EXP: invalid BPF_LD_IMM insn
> >>> RES: jump to reserved code from insn 0 to 2
> >>> verification time 9 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>>
> >>> jump to reserved code from insn 0 to 2
> >>> verification time 9 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>> #459/u test2 ld_imm64 FAIL
> >>> Unexpected verifier log!
> >>> EXP: R1 pointer comparison
> >>> RES:
> >>> FAIL
> >>> Unexpected error message!
> >>> EXP: R1 pointer comparison
> >>> RES: jump to reserved code from insn 0 to 2
> >>> verification time 11 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>>
> >>> jump to reserved code from insn 0 to 2
> >>> verification time 11 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>> #459/p test2 ld_imm64 FAIL
> >>> Unexpected verifier log!
> >>> EXP: invalid BPF_LD_IMM insn
> >>> RES:
> >>> FAIL
> >>> Unexpected error message!
> >>> EXP: invalid BPF_LD_IMM insn
> >>> RES: jump to reserved code from insn 0 to 2
> >>> verification time 8 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>>
> >>> jump to reserved code from insn 0 to 2
> >>> verification time 8 usec
> >>> stack depth 0
> >>> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >>> #460/u test3 ld_imm64 OK
> >>>
> >>>>> func#0 @0
> >>>>> jump to reserved code from insn 8 to 7
> >>>>>
> >>>>> Signed-off-by: Hao Sun <sunhao.th@xxxxxxxxx>
> >>>
> >>> nit: This needs to be before the "---" line.
> >>>
> >>>>> ---
> >>>>> kernel/bpf/verifier.c | 7 +++++++
> >>>>> 1 file changed, 7 insertions(+)
> >>>>>
> >>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >>>>> index eed7350e15f4..725ac0b464cf 100644
> >>>>> --- a/kernel/bpf/verifier.c
> >>>>> +++ b/kernel/bpf/verifier.c
> >>>>> @@ -14980,6 +14980,7 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
> >>>>> {
> >>>>> int *insn_stack = env->cfg.insn_stack;
> >>>>> int *insn_state = env->cfg.insn_state;
> >>>>> + struct bpf_insn *insns = env->prog->insnsi;
> >>>>>
> >>>>> if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
> >>>>> return DONE_EXPLORING;
> >>>>> @@ -14993,6 +14994,12 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
> >>>>> return -EINVAL;
> >>>>> }
> >>>>>
> >>>>> + if (e == BRANCH && insns[w].code == 0) {
> >>>>> + verbose_linfo(env, t, "%d", t);
> >>>>> + verbose(env, "jump to reserved code from insn %d to %d\n", t, w);
> >>>>> + return -EINVAL;
> >>>>> + }
> >>>
> >>> Other than that, lgtm.
> >>
> >> We do rely quite a lot on verifier not complaining eagerly about some
> >> potentially invalid instructions if it's provable that some portion of
> >> the code won't ever be reached (think using .rodata variables for
> >> feature gating, poisoning intructions due to failed CO-RE relocation,
> >> which libbpf does actively, except it's using a call to non-existing
> >> helper). As such, check_cfg() is a wrong place to do such validity
> >> checks because some of the branches might never be run and validated
> >> in practice.
> >
> > Don't really agree. Jump to the middle of ld_imm64 is just like jumping
> > out of bounds, both break the CFG integrity immediately. For those
> > apparently incorrect jumps, rejecting early makes everything simple;
> > otherwise, we probably need some rewrite in the end.
>
> Could you elaborate on the 'breaking CFG integrity immediately'? This was
> what I was trying to gather earlier with log improvement vs actual fix.
>
> Do you mean /potentially/ breaking CFG integrity, if, say, we had a double
> insn jump in future and there is a back-jump to the 2nd part of the insn?
>

I mean jumping to the middle of ld_imm64 is similar to jumping out-of-bound,
both are CFG-related issues and can be handled early in one place.

For the case you mentioned, the current code would handle such an issue in
check_ld_imm64(), and again gives "BAD_LD_IMM" log, which is strange.

> > Also, as you mentioned, libbpf relies on non-existing helpers, not jump
> > to the middle of ld_imm64. It seems better and easier to not leave this
> > hole.
>
> Thanks,
> Daniel