Re: [RFC PATCH bpf-next 1/3] arm64: insn: Fix two bugs in encoding 32-bit logical immediates

From: Will Deacon
Date: Fri May 08 2020 - 07:47:19 EST


On Thu, May 07, 2020 at 02:48:07PM -0700, Luke Nelson wrote:
> Thanks for the comments! Responses below:
>
> > It's a bit grotty spreading the checks out now. How about we tweak things
> > slightly along the lines of:
> >
> >
> > diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> > index 4a9e773a177f..60ec788eaf33 100644
> > --- a/arch/arm64/kernel/insn.c
> > +++ b/arch/arm64/kernel/insn.c
> > [...]
>
> Agreed; this new version looks much cleaner. I re-ran all the tests /
> verification and everything seems good. Would you like me to submit a
> v2 of this series with this new code?

Yes, please! And please include Daniel's acks on the BPF changes too. It's a
public holiday here in the UK today, but I can pick this up next week.

> >> We tested the new code against llvm-mc with all 1,302 encodable 32-bit
> >> logical immediates and all 5,334 encodable 64-bit logical immediates.
> >
> > That, on its own, is awesome information. Do you have any pointer on
> > how to set this up?
>
> Sure! The process of running the tests is pretty involved, but I'll
> describe it below and give some links here.
>
> We found the bugs in insn.c while adding support for logical immediates
> to the BPF JIT and verifying the changes with our tool, Serval:
> https://github.com/uw-unsat/serval-bpf. The basic idea for how we tested /
> verified logical immediates is the following:
>
> First, we have a Python script [1] for generating every encodable
> logical immediate and the corresponding instruction fields that encode
> that immediate. The script validates the list by checking that llvm-mc
> decodes each instruction back to the expected immediate.
>
> Next, we use the list [2] from the first step to check a Racket
> translation [3] of the logical immediate encoding function in insn.c.
> We found the second mask bug by noticing that some (encodable) 32-bit
> immediates were being rejected by the encoding function.
>
> Last, we use the Racket translation of the encoding function to verify
> the correctness of the BPF JIT implementation [4], i.e., the JIT
> correctly compiles BPF_{AND,OR,XOR,JSET} BPF_K instructions to arm64
> instructions with equivalent semantics. We found the first bug as the
> verifier complained that the function was producing invalid encodings
> for 32-bit -1 immediates, and we were able to reproduce a kernel crash
> using the BPF tests.
>
> We manually translated the C code to Racket because our verifier, Serval,
> currently only works on Racket code.

Nice! Two things:

(1) I really think you should give a talk on this at a Linux conference
(2) Did you look at any instruction generation functions other than the
logical immediate encoding function?

Cheers,

Will