Re: [PATCH 11/14] llvm: kCFI pointer stuff

From: Alexei Starovoitov
Date: Mon Sep 30 2024 - 12:59:38 EST


On Mon, Sep 30, 2024 at 1:27 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Sun, Sep 29, 2024 at 10:53:05AM -0700, Alexei Starovoitov wrote:
>
> > > + // Extend the kCFI meta-data with a byte that has a bit set for each argument
> > > + // register that contains a pointer. Specifically for x86_64, which has 6
> > > + // argument registers:
> > > + //
> > > + // bit0 - rdi
> > > + // bit1 - rsi
> > > + // bit2 - rdx
> > > + // bit3 - rcx
> > > + // bit4 - r8
> > > + // bit5 - r9
> > > + //
> > > + // bit6 will denote any pointer on stack (%rsp), and all 7 bits set will
> > > + // indicate vararg or any other 'unknown' configuration. Leaving bit7 for
> > > + // error states.
> > > + //
> > > + // XXX: should be conditional on some new x86_64 specific 'bhi' argument.
> > > + EmitAndCountInstruction(MCInstBuilder(X86::MOV8ri)
> > > + .addReg(X86::AL)
> > > + .addImm(getKCFIPointerArgs(F)));
> >
> > If I'm reading this correctly it will be an 8-bit move which
> > doesn't clear upper bits.
> > If consumer is in assembly it's ok-ish,
> > but it's an argument to __bhi_args_foo functions,
> > so should be properly zero extended per call convention.
>
> These kCFI 'instructions' are never executed. Their sole purpose is to
> encode the immediates. They are instructions because they live in .text
> and having them this way makes disassemly work nicely. As such, we've
> taken to using the 1 byte move instruction to carry them with the least
> amounts of bytes.
>
> The consumer is the kernel instruction decoder, we take the immediate
> and use that.

I see... and after decoding imm bits in mov %al insn the kernel will
insert a call to corresponding __bhi_args_* stub that will use
cmovne on corresponding register(s) to sanitize the value?
That was difficult to grasp.
A design doc would have helped.

I wonder whether this whole complexity is worth it vs
always calling __bhi_args_all()