[PATCH v2] hardening: Introduce CONFIG_ZERO_CALL_USED_REGS

From: Kees Cook
Date: Wed Jul 14 2021 - 18:01:36 EST


When CONFIG_ZERO_CALL_USED_REGS is enabled, build the kernel with
"-fzero-call-used-regs=used-gpr" (in GCC 11). This option will zero any
caller-used register contents just before returning from a function,
ensuring that temporary values are not leaked beyond the function
boundary. This means that register contents are less likely to be
available for side channel attacks and information exposures.

Additionally this helps reduce the number of useful ROP gadgets in the
kernel image by about 20%:

$ ROPgadget.py --nosys --nojop --binary vmlinux.stock | tail -n1
Unique gadgets found: 337245

$ ROPgadget.py --nosys --nojop --binary vmlinux.zero-call-regs | tail -n1
Unique gadgets found: 267175

and more notably removes simple "write-what-where" gadgets:

$ ROPgadget.py --ropchain --binary vmlinux.stock | sed -n '/Step 1/,/Step 2/p'
- Step 1 -- Write-what-where gadgets

[+] Gadget found: 0xffffffff8102d76c mov qword ptr [rsi], rdx ; ret
[+] Gadget found: 0xffffffff81000cf5 pop rsi ; ret
[+] Gadget found: 0xffffffff8104d7c8 pop rdx ; ret
[-] Can't find the 'xor rdx, rdx' gadget. Try with another 'mov [reg], reg'

[+] Gadget found: 0xffffffff814c2b4c mov qword ptr [rsi], rdi ; ret
[+] Gadget found: 0xffffffff81000cf5 pop rsi ; ret
[+] Gadget found: 0xffffffff81001e51 pop rdi ; ret
[-] Can't find the 'xor rdi, rdi' gadget. Try with another 'mov [reg], reg'

[+] Gadget found: 0xffffffff81540d61 mov qword ptr [rsi], rdi ; pop rbx ; pop rbp ; ret
[+] Gadget found: 0xffffffff81000cf5 pop rsi ; ret
[+] Gadget found: 0xffffffff81001e51 pop rdi ; ret
[-] Can't find the 'xor rdi, rdi' gadget. Try with another 'mov [reg], reg'

[+] Gadget found: 0xffffffff8105341e mov qword ptr [rsi], rax ; ret
[+] Gadget found: 0xffffffff81000cf5 pop rsi ; ret
[+] Gadget found: 0xffffffff81029a11 pop rax ; ret
[+] Gadget found: 0xffffffff811f1c3b xor rax, rax ; ret

- Step 2 -- Init syscall number gadgets

$ ROPgadget.py --ropchain --binary vmlinux.zero* | sed -n '/Step 1/,/Step 2/p'
- Step 1 -- Write-what-where gadgets

[-] Can't find the 'mov qword ptr [r64], r64' gadget

For an x86_64 parallel build tests, this has a less than 1% performance
impact, and grows the image size less than 1%:

$ size vmlinux.stock vmlinux.zero-call-regs
text data bss dec hex filename
22437676 8559152 14127340 45124168 2b08a48 vmlinux.stock
22453184 8563248 14110956 45127388 2b096dc vmlinux.zero-call-regs

Impact for other architectures may vary. For example, arm64 sees a 5.5%
image size growth, mainly due to needing to always clear x16 and x17:
https://lore.kernel.org/lkml/20210510134503.GA88495@C02TD0UTHF1T.local/

Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
---
v2: adjust image size language to reflect arm64 differences (mark)
v1: https://lore.kernel.org/lkml/20210505191804.4015873-1-keescook@xxxxxxxxxxxx/
---
Makefile | 5 +++++
security/Kconfig.hardening | 19 +++++++++++++++++++
2 files changed, 24 insertions(+)

diff --git a/Makefile b/Makefile
index c3f9bd191b89..bed9c68d93d3 100644
--- a/Makefile
+++ b/Makefile
@@ -841,6 +841,11 @@ endif
# for the randomize_kstack_offset feature. Disable it for all compilers.
KBUILD_CFLAGS += $(call cc-option, -fno-stack-clash-protection)

+# Clear used registers at func exit (to reduce data lifetime and ROP gadgets).
+ifdef CONFIG_ZERO_CALL_USED_REGS
+KBUILD_CFLAGS += -fzero-call-used-regs=used-gpr
+endif
+
DEBUG_CFLAGS :=

# Workaround for GCC versions < 5.0
diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index a56c36470cb1..023aea5e117c 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -217,6 +217,25 @@ config INIT_ON_FREE_DEFAULT_ON
touching "cold" memory areas. Most cases see 3-5% impact. Some
synthetic workloads have measured as high as 8%.

+config CC_HAS_ZERO_CALL_USED_REGS
+ def_bool $(cc-option,-fzero-call-used-regs=used-gpr)
+
+config ZERO_CALL_USED_REGS
+ bool "Enable register zeroing on function exit"
+ depends on CC_HAS_ZERO_CALL_USED_REGS
+ help
+ At the end of functions, always zero any caller-used register
+ contents. This helps ensure that temporary values are not
+ leaked beyond the function boundary. This means that register
+ contents are less likely to be available for side channels
+ and information exposures. Additionally, this helps reduce the
+ number of useful ROP gadgets by about 20% (and removes compiler
+ generated "write-what-where" gadgets) in the resulting kernel
+ image. This has a less than 1% performance impact on most
+ workloads. Image size growth depends on architecture, and should
+ be evaluated for suitability. For example, x86_64 grows by less
+ than 1%, and arm64 grows by about 5%.
+
endmenu

endmenu
--
2.30.2