Re: [PATCH v7 net-next 1/3] filter: add Extended BPF interpreter and converter

From: Alexei Starovoitov
Date: Sun Mar 09 2014 - 14:03:00 EST


On Sun, Mar 9, 2014 at 7:49 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On Sat, 2014-03-08 at 15:15 -0800, Alexei Starovoitov wrote:
>
>> +			if (BPF_SRC(fp->code) == BPF_K &&
>> +			    (int)fp->k < 0) {
>> +				/* extended BPF immediates are signed,
>> +				 * zero extend immediate into tmp register
>> +				 * and use it in compare insn
>> +				 */
>> +				insn->code = BPF_ALU | BPF_MOV | BPF_K;
>> +				insn->a_reg = 2;
>> +				insn->imm = fp->k;
>> +				insn++;
>> +
>> +				insn->a_reg = 6;
>> +				insn->x_reg = 2;
>> +				bpf_src = BPF_X;
>> +			} else {
>> +				insn->a_reg = 6;
>> +				insn->x_reg = 7;
>> +				insn->imm = fp->k;
>> +				bpf_src = BPF_SRC(fp->code);
>> +			}
>> +			/* common case where 'jump_false' is next insn */
>> +			if (fp->jf == 0) {
>> +				insn->code = BPF_JMP | BPF_OP(fp->code) |
>> +					bpf_src;
>> +				tgt = i + fp->jt + 1;
>> +				EMIT_JMP;
>> +				break;
>> +			}
>> +			/* convert JEQ into JNE when 'jump_true' is next insn */
>> +			if (fp->jt == 0 && BPF_OP(fp->code) == BPF_JEQ) {
>> +				insn->code = BPF_JMP | BPF_JNE | bpf_src;
>> +				tgt = i + fp->jf + 1;
>> +				EMIT_JMP;
>> +				break;
>> +			}
>> +			/* other jumps are mapped into two insns: Jxx and JA */
>> +			tgt = i + fp->jt + 1;
>> +			insn->code = BPF_JMP | BPF_OP(fp->code) | bpf_src;
>> +			EMIT_JMP;
>> +
>> +			insn++;
>> +			insn->code = BPF_JMP | BPF_JA;
>> +			tgt = i + fp->jf + 1;
>> +			EMIT_JMP;
>> +			break;
>> +
>> +		/* ldxb 4*([14]&0xf) is remapped into 3 insns */
>> +		case BPF_LDX | BPF_MSH | BPF_B:
>> +			insn->code = BPF_LD | BPF_ABS | BPF_B;
>> +			insn->a_reg = 7;
>> +			insn->imm = fp->k;
>> +
>> +			insn++;
>> +			insn->code = BPF_ALU | BPF_AND | BPF_K;
>> +			insn->a_reg = 7;
>> +			insn->imm = 0xf;
>> +
>> +			insn++;
>> +			insn->code = BPF_ALU | BPF_LSH | BPF_K;
>> +			insn->a_reg = 7;
>> +			insn->imm = 2;
>> +			break;
>> +
>> +		/* RET_K, RET_A are remapped into 2 insns */
>> +		case BPF_RET | BPF_A:
>> +		case BPF_RET | BPF_K:
>> +			insn->code = BPF_ALU | BPF_MOV |
>> +				(BPF_RVAL(fp->code) == BPF_K ? BPF_K : BPF_X);
>> +			insn->a_reg = 0;
>> +			insn->x_reg = 6;
>> +			insn->imm = fp->k;
>> +
>> +			insn++;
>> +			insn->code = BPF_RET | BPF_K;
>> +			break;
>
>
> What the hell is this ?
>
> All this magical values, like 2, 6, 7, 10.

They are register numbers: they are assigned to 'a_reg' and 'x_reg',
which are described in uapi/filter.h:
__u8 a_reg:4; /* dest register */
__u8 x_reg:4; /* source register */
and in Doc...filter.txt

In the V1 series I had a bunch of #defines like:
#define R1 1
#define R2 2
which seemed as silly as doing '#define one 1'.
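
For illustration only, here is a minimal sketch of what role-based names
(rather than R1/R2-style ones) could look like. Everything prefixed with
sketch_/SKETCH_ is hypothetical and not part of the patch; the roles are
inferred solely from how the quoted hunk uses 0, 2, 6 and 7. The value 10
does not appear in that hunk, so it is left out. The struct is a stand-in:
only the 4-bit a_reg/x_reg fields come from the header excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; roles are inferred from the quoted hunk only. */
enum sketch_reg {
	SKETCH_REG_RET = 0,	/* destination of the MOV emitted for RET_K/RET_A */
	SKETCH_REG_TMP = 2,	/* scratch register holding a negative immediate */
	SKETCH_REG_A   = 6,	/* stands in for the classic BPF accumulator A */
	SKETCH_REG_X   = 7,	/* stands in for the classic BPF index register X */
};

/* Stand-in instruction layout (assumption, not the real structure). */
struct sketch_insn {
	uint8_t code;
	unsigned int a_reg:4;	/* dest register */
	unsigned int x_reg:4;	/* source register */
	int32_t imm;
};

/* Register operands of the MOV emitted for "ret A" in the quoted hunk. */
static struct sketch_insn sketch_mov_ret_from_a(void)
{
	struct sketch_insn insn = { 0 };

	insn.a_reg = SKETCH_REG_RET;
	insn.x_reg = SKETCH_REG_A;
	/* Same numeric values as the literals 0 and 6 in the hunk. */
	assert(insn.a_reg == 0 && insn.x_reg == 6);
	return insn;
}
```

With names like these, 'insn->a_reg = 6; insn->x_reg = 7;' would read as
'a_reg = SKETCH_REG_A; x_reg = SKETCH_REG_X', which states the role of each
register rather than restating its number.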

I thought that the sk_convert_filter() code was pretty clear in terms
of what it's doing, but I'm happy to add an extensive comment to
describe the mechanics.
Also, it felt like most of the time you and other folks want me to remove
comments, so I figured I'd add comments on demand.
This looks like such a case.
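
As a rough sketch of the jump mechanics in the quoted hunk (not the patch
code itself): a classic BPF conditional jump carries two relative targets,
'jt' and 'jf', and is mapped to one or two single-target jumps. The names
sketch_op, sketch_jmp and sketch_convert_jmp are hypothetical. Targets are
kept as indices into the classic program; the translation to offsets in the
converted program, which EMIT_JMP performs in the patch, is not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode set, for the sketch only. */
enum sketch_op { SK_JA, SK_JEQ, SK_JNE, SK_JGT, SK_JGE, SK_JSET };

struct sketch_jmp {
	enum sketch_op op;
	size_t tgt;	/* target as an index into the classic program */
};

/*
 * Map the classic jump at index i with offsets jt/jf.
 * Returns the number of jumps written to out[] (1 or 2).
 */
static int sketch_convert_jmp(enum sketch_op op, uint8_t jt, uint8_t jf,
			      size_t i, struct sketch_jmp out[2])
{
	/* 'jump_false' falls through to the next insn: one jump suffices */
	if (jf == 0) {
		out[0].op = op;
		out[0].tgt = i + jt + 1;
		return 1;
	}
	/* 'jump_true' falls through: invert JEQ into JNE */
	if (jt == 0 && op == SK_JEQ) {
		out[0].op = SK_JNE;
		out[0].tgt = i + jf + 1;
		return 1;
	}
	/* general case: conditional jump, then unconditional JA */
	out[0].op = op;
	out[0].tgt = i + jt + 1;
	out[1].op = SK_JA;
	out[1].tgt = i + jf + 1;
	return 2;
}
```

For example, a JGT at index 10 with jt=2 and jf=4 yields two jumps: JGT to
index 13, followed by JA to index 15.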

> I am afraid nobody will be able to read this but you.

That's certainly not the intent. I presented it at the last Plumbers
conference and would like to share more, since I think eBPF is a fundamental
breakthrough that can be used by many kernel subsystems.
This patch only covers old filters and seccomp.
We can do a lot more interesting things with tracing+eBPF and so on.

Regards,
Alexei