Re: [PATCH bpf-next v3 1/3] bpf: Detect jumping to reserved code during check_cfg()

From: Shung-Hsi Yu
Date: Thu Oct 12 2023 - 23:28:15 EST


On Thu, Oct 12, 2023 at 08:02:00AM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 12, 2023 at 1:14 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote:
> > On Wed, Oct 11, 2023 at 06:38:56AM -0700, Alexei Starovoitov wrote:
> > > On Wed, Oct 11, 2023 at 2:01 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
> > > >
> > > > Currently, we don't check if the branch-taken of a jump is reserved code of
> > > > ld_imm64. Instead, such a issue is captured in check_ld_imm(). The verifier
> > > > gives the following log in such case:
> > > >
> > > > func#0 @0
> > > > 0: R1=ctx(off=0,imm=0) R10=fp0
> > > > 0: (18) r4 = 0xffff888103436000 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0)
> > > > 2: (18) r1 = 0x1d ; R1_w=29
> > > > 4: (55) if r4 != 0x0 goto pc+4 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0)
> > > > 5: (1c) w1 -= w1 ; R1_w=0
> > > > 6: (18) r5 = 0x32 ; R5_w=50
> > > > 8: (56) if w5 != 0xfffffff4 goto pc-2
> > > > mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1
> > > > mark_precise: frame0: regs=r5 stack= before 6: (18) r5 = 0x32
> > > > 7: R5_w=50
> > > > 7: BUG_ld_00
> > > > invalid BPF_LD_IMM insn
> > > >
> > > > Here the verifier rejects the program because it thinks insn at 7 is an
> > > > invalid BPF_LD_IMM, but such a error log is not accurate since the issue
> > > > is jumping to reserved code not because the program contains invalid insn.
> > > > Therefore, make the verifier check the jump target during check_cfg(). For
> > > > the same program, the verifier reports the following log:
> > > >
> > > > func#0 @0
> > > > jump to reserved code from insn 8 to 7
> > > >
> > > > Signed-off-by: Hao Sun <sunhao.th@xxxxxxxxx>
> > > > ---
> > > > kernel/bpf/verifier.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index eed7350e15f4..725ac0b464cf 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -14980,6 +14980,7 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
> > > > {
> > > > int *insn_stack = env->cfg.insn_stack;
> > > > int *insn_state = env->cfg.insn_state;
> > > > + struct bpf_insn *insns = env->prog->insnsi;
> > > >
> > > > if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
> > > > return DONE_EXPLORING;
> > > > @@ -14993,6 +14994,12 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
> > > > return -EINVAL;
> > > > }
> > > >
> > > > + if (e == BRANCH && insns[w].code == 0) {
> > > > + verbose_linfo(env, t, "%d", t);
> > > > + verbose(env, "jump to reserved code from insn %d to %d\n", t, w);
> > > > + return -EINVAL;
> > > > + }
> > >
> > > I don't think we should be changing the verifier to make
> > > fuzzer logs more readable.
> >
> > Taking fuzzer out of consideration, giving users clearer explanation for
> > such verifier rejection could save a lot of head scratching.
>
> Users won't see such errors unless they are actively doing what
> is not recommended.
>
> > Compiler shouldn't generate such program, but its plausible to forget to
> > account that BPF_LD_IMM64 consists of two instructions when writing
> > assembly (especially with filter.h-like macros) and have it jump to the 2nd
> > part of BPF_LD_IMM64.
>
> Using macros to write bpf asm code is highly discouraged.
> All kinds of errors are possible.
> Bogus jump is just one of such mistakes.
> Use naked functions and inline asm in C code that
> both GCC and clang understand then you won't see bad jumps.
> See selftets/bpf/verifier_*.c as an example.

Understood, thanks for the explanation!

Found them under progs/verifier_*.c inside the bpf selftest directory.

> > > Same with patch 2. The code is fine as-is.
> >
> > The only way BPF_SIZE(insn->code) != BPF_DW conditional in check_ld_imm()
> > can be met right now is when we have a jump to the 2nd part of LD_IMM64; but
> > what this conditional actually guard against is not straight-forward and
> > quite confusing[1].
>
> There are plenty of cases in the verifier where we print
> an error message. Some of them should be impossible due
> to prior checks. In such cases we don't yell "verifier bug"
> and are not going to do that in this case either.

I agree, without patch 1 applied, the change to "verfier bug" in patch 2
doesn't make sense and is just wrong. The point I'm trying to make is that
the checks done by verifier are generally clear, you can make sense of why
certain check are in place just by looking at the code, but
BPF_SIZE(insn->code) != BPF_DW is _not_ one of them.

I got confused, (reading between the lines I believe) this had Hao puzzled,
and even Yongsong had to look twice[1] back then; so this check is certainly
not on-par with others we have in the verifier in terms of clarity, which
leads to patches here as well as mine a while back.

Perhaps we could reconsider making it more obvious how verifier prevents
jump to reserved code/2nd instruction of LD_IMM64?


1: the same https://lore.kernel.org/bpf/0cf50c32-ab67-ef23-7b84-ef1d4e007c33@xxxxxx/