Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

From: Peter Zijlstra
Date: Fri Jul 22 2022 - 06:24:24 EST


On Thu, Jul 21, 2022 at 05:16:14PM -0700, Sami Tolvanen wrote:

> That looks good to me. I updated my LLVM tree to generate this code
> for the checks:
>
> https://github.com/samitolvanen/llvm-project/commits/kcfi

Thanks!

The alignment thing you added:

// Emit int3 padding before the type information to maintain alignment.
// The X86::MOV32ri instruction we emit is 5 bytes long.
uint64_t Padding = offsetToAlignment(5, MF.getAlignment());
while (Padding--)
EmitAndCountInstruction(MCInstBuilder(X86::INT3));

Doesn't seem to quite do what we want though.

When I use -fpatchable-function-entry=16,16 we effectively get a 32 byte
prefix on every function:

0000000000000000 <__cfi___traceiter_sched_kthread_stop>:
0: cc int3
1: cc int3
2: cc int3
3: cc int3
4: cc int3
5: cc int3
6: cc int3
7: cc int3
8: cc int3
9: cc int3
a: cc int3
b: b8 26 b1 df 98 mov $0x98dfb126,%eax
10: 90 nop
11: 90 nop
12: 90 nop
13: 90 nop
14: 90 nop
15: 90 nop
16: 90 nop
17: 90 nop
18: 90 nop
19: 90 nop
1a: 90 nop
1b: 90 nop
1c: 90 nop
1d: 90 nop
1e: 90 nop
1f: 90 nop

And given the parameters, that's indeed the only option. However, given
I can scribble the type thing just fine when moving to FineIBT and the
whole Skylake depth tracking only needs 10 bytes, I figured I'd try:
-fpatchable-function-entry=11,11 instead. But that resulted in
unalignment:

0000000000000000 <__cfi___traceiter_sched_kthread_stop>:
0: cc int3
1: cc int3
2: cc int3
3: cc int3
4: cc int3
5: cc int3
6: cc int3
7: cc int3
8: cc int3
9: cc int3
a: cc int3
b: b8 26 b1 df 98 mov $0x98dfb126,%eax
10: 90 nop
11: 90 nop
12: 90 nop
13: 90 nop
14: 90 nop
15: 90 nop
16: 90 nop
17: 90 nop
18: 90 nop
19: 90 nop
1a: 90 nop

000000000000001b <__traceiter_sched_kthread_stop>:

However, if I change clang like so:

llvm/lib/Target/X86/X86AsmPrinter.cpp | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index 789597f8ef1a..6c94313a197d 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -124,9 +124,15 @@ void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF,
OutStreamer->emitSymbolAttribute(FnSym, MCSA_ELF_TypeFunction);
OutStreamer->emitLabel(FnSym);

+ int64_t PrefixNops = 0;
+ (void)MF.getFunction()
+ .getFnAttribute("patchable-function-prefix")
+ .getValueAsString()
+ .getAsInteger(10, PrefixNops);
+
// Emit int3 padding before the type information to maintain alignment.
// The X86::MOV32ri instruction we emit is 5 bytes long.
- uint64_t Padding = offsetToAlignment(5, MF.getAlignment());
+ uint64_t Padding = offsetToAlignment(5+PrefixNops, MF.getAlignment());
while (Padding--)
EmitAndCountInstruction(MCInstBuilder(X86::INT3));


Then it becomes:

0000000000000000 <__cfi___traceiter_sched_kthread_stop>:
0: b8 26 b1 df 98 mov $0x98dfb126,%eax
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 90 nop
a: 90 nop
b: 90 nop
c: 90 nop
d: 90 nop
e: 90 nop
f: 90 nop

0000000000000010 <__traceiter_sched_kthread_stop>:

and things are 'good' again, except for functions that don't get a kcfi
preamble, those are unaligned... I couldn't find where the
patchable-function-prefix nops are generated to fix this up :/


Also; could you perhaps add a switch to supress ENDBR for functions with
a kCFI preamble ?