[PATCH 11/14] llvm: kCFI pointer stuff

From: Peter Zijlstra
Date: Fri Sep 27 2024 - 15:53:24 EST


Quick hack to extend the Clang-kCFI function meta-data (u32 hash) with
a u8 bitmask of pointer arguments. This should really be under a new
compiler flag, dependent on both x86_64 and kCFI.

Per the comment, the bitmask represents the register based arguments
as the first 6 bits and bit 6 is used to cover all stack based
arguments. The high bit is used for invalid values.

The purpose is to put a store dependency on the set registers, thereby
blocking speculation paths that would otherwise expoit their value.


Note1:

This implementation simply sets the bit for any pointer type. A better
implementation would only set the bit for any argument that is
dereferenced in the function body.

This better implementation would also capture things like:

void foo(unsigned long addr, void *args)
{
u32 t = *(u32 *)addr;
bar(t, args);
}

Which, in contrast to the implementation below, would set bit0 while
leaving bit1 unset -- the exact opposite of this implementation.

Notably, addr *is* dereferenced, even though it is not a pointer on
entry, while args is a pointer, but is not derefereced but passed on
to bar -- if bar uses it, it gets to deal with it.

Note2:

Do we want to make this a u32 to keep room for all registers? AFAICT
the current use is only concerned with the argument registers and
those are limited to 6 for the C ABI, but custom (assembly) functions
could use things outside of that.

---
diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index 73c745062096..42dcbc40ab4b 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -143,11 +143,28 @@ void X86AsmPrinter::EmitKCFITypePadding(const MachineFunction &MF,
// one. Otherwise, just pad with nops. The X86::MOV32ri instruction emitted
// in X86AsmPrinter::emitKCFITypeId is 5 bytes long.
if (HasType)
- PrefixBytes += 5;
+ PrefixBytes += 7;

emitNops(offsetToAlignment(PrefixBytes, MF.getAlignment()));
}

+static uint8_t getKCFIPointerArgs(const Function &F)
+{
+ uint8_t val = 0;
+
+ if (F.isVarArg())
+ return 0x7f;
+
+ for (int i = 0; i < F.arg_size() ; i++) {
+ Argument *A = F.getArg(i);
+ Type *T = A->getType();
+ if (T->getTypeID() == Type::PointerTyID)
+ val |= 1 << std::min(i, 6);
+ }
+
+ return val;
+}
+
/// emitKCFITypeId - Emit the KCFI type information in architecture specific
/// format.
void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF) {
@@ -183,6 +200,26 @@ void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF) {
.addReg(X86::EAX)
.addImm(MaskKCFIType(Type->getZExtValue())));

+ // Extend the kCFI meta-data with a byte that has a bit set for each argument
+ // register that contains a pointer. Specifically for x86_64, which has 6
+ // argument registers:
+ //
+ // bit0 - rdi
+ // bit1 - rsi
+ // bit2 - rdx
+ // bit3 - rcx
+ // bit4 - r8
+ // bit5 - r9
+ //
+ // bit6 will denote any pointer on stack (%rsp), and all 7 bits set will
+ // indicate vararg or any other 'unknown' configuration. Leaving bit7 for
+ // error states.
+ //
+ // XXX: should be conditional on some new x86_64 specific 'bhi' argument.
+ EmitAndCountInstruction(MCInstBuilder(X86::MOV8ri)
+ .addReg(X86::AL)
+ .addImm(getKCFIPointerArgs(F)));
+
if (MAI->hasDotTypeDotSizeDirective()) {
MCSymbol *EndSym = OutContext.createTempSymbol("cfi_func_end");
OutStreamer->emitLabel(EndSym);
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index cbb012161524..c0776ef78153 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -897,7 +897,7 @@ void X86AsmPrinter::LowerKCFI_CHECK(const MachineInstr &MI) {
.addReg(AddrReg)
.addImm(1)
.addReg(X86::NoRegister)
- .addImm(-(PrefixNops + 4))
+ .addImm(-(PrefixNops + 6))
.addReg(X86::NoRegister));

MCSymbol *Pass = OutContext.createTempSymbol();