Re: [PATCH 11/14] llvm: kCFI pointer stuff

From: Peter Zijlstra
Date: Tue Oct 01 2024 - 06:21:40 EST


On Mon, Sep 30, 2024 at 09:59:11AM -0700, Alexei Starovoitov wrote:
> On Mon, Sep 30, 2024 at 1:27 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Sun, Sep 29, 2024 at 10:53:05AM -0700, Alexei Starovoitov wrote:
> >
> > > > + // Extend the kCFI meta-data with a byte that has a bit set for each argument
> > > > + // register that contains a pointer. Specifically for x86_64, which has 6
> > > > + // argument registers:
> > > > + //
> > > > + // bit0 - rdi
> > > > + // bit1 - rsi
> > > > + // bit2 - rdx
> > > > + // bit3 - rcx
> > > > + // bit4 - r8
> > > > + // bit5 - r9
> > > > + //
> > > > + // bit6 will denote any pointer on stack (%rsp), and all 7 bits set will
> > > > + // indicate vararg or any other 'unknown' configuration. Leaving bit7 for
> > > > + // error states.
> > > > + //
> > > > + // XXX: should be conditional on some new x86_64 specific 'bhi' argument.
> > > > + EmitAndCountInstruction(MCInstBuilder(X86::MOV8ri)
> > > > + .addReg(X86::AL)
> > > > + .addImm(getKCFIPointerArgs(F)));
> > >
> > > If I'm reading this correctly it will be an 8-bit move which
> > > doesn't clear upper bits.
> > > If consumer is in assembly it's ok-ish,
> > > but it's an argument to __bhi_args_foo functions,
> > > so should be properly zero extended per call convention.
> >
> > These kCFI 'instructions' are never executed. Their sole purpose is to
> > encode the immediates. They are instructions because they live in .text
> > and having them this way makes disassemly work nicely. As such, we've
> > taken to using the 1 byte move instruction to carry them with the least
> > amounts of bytes.
> >
> > The consumer is the kernel instruction decoder, we take the immediate
> > and use that.
>
> I see... and after decoding imm bits in mov %al insn the kernel will
> insert a call to corresponding __bhi_args_* stub that will use
> cmovne on corresponding register(s) to sanitize the value?
> That was difficult to grasp.
> A design doc would have helped.

Does something like this help?

diff --git a/arch/x86/include/asm/cfi.h b/arch/x86/include/asm/cfi.h
index 31d19c815f99..b6e7e79e79c6 100644
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -44,11 +44,28 @@
* call *%r11
*
*
+ * IBT+:
+ *
+ * foo:
+ * endbr64 / ud1 0(%eax), %edx
+ * ... code here ...
+ * ret
+ *
+ * direct caller:
+ * call foo+4
+ *
+ * indirect caller:
+ * lea foo(%rip), %r11
+ * ...
+ * call *%r11
+ *
+ *
* kCFI:
*
* __cfi_foo:
* movl $0x12345678, %eax # kCFI signature hash
- * # 11 nops when CONFIG_CALL_PADDING
+ * movb $0x12, %al # kCFI pointer argument mask
+ * # 9 nops when CONFIG_CALL_PADDING
* foo:
* endbr64 # when IBT
* ... code here ...
@@ -91,6 +108,57 @@
* nop4
* call *%r11
*
+ *
+ * FineIBT+:
+ *
+ * __cfi_foo:
+ * endbr64
+ * subl 0x12345678, %r10d
+ * jz foo
+ * ud2
+ * nop
+ * foo:
+ * ud1 0(%eax), %edx # was endbr64
+ * foo_4:
+ * ... code here ...
+ * ret
+ *
+ * direct caller:
+ * call foo+4
+ *
+ * indirect caller:
+ * lea foo(%rip), %r11
+ * ...
+ * movl $0x12345678, %r10d
+ * subl $16, %r11
+ * nop4
+ * call *%r11
+ *
+ *
+ * FineIBT-BHI:
+ *
+ * __cfi_foo:
+ * endbr64
+ * subl 0x12345678, %r10d
+ * jz foo-1
+ * ud2
+ * foo-1:
+ * call __bhi_args_XXX # depends on kCFI pointer argument mask
+ * foo+4:
+ * ... code here ...
+ * ret
+ *
+ * direct caller:
+ * call foo+4
+ *
+ * indirect caller:
+ * lea foo(%rip), %r11
+ * ...
+ * movl $0x12345678, %r10d
+ * subl $16, %r11
+ * nop4
+ * call *%r11
+ *
*/
enum cfi_mode {
CFI_AUTO, /* FineIBT if hardware has IBT, otherwise kCFI */




> I wonder whether this whole complexity is worth it vs
> always calling __bhi_args_all()

That's one for Scott to answer; I think always doing _all will hurt
especially bad because it includes rsp.