Re: [PATCH v2 4/6] arm64: probes: Add GCS support to bl/blr/ret

From: Jeremy Linton
Date: Fri Apr 11 2025 - 11:10:59 EST


Hi,

On 4/7/25 11:19 AM, Jeremy Linton wrote:
The arm64 probe simulation doesn't currently have logic in place
to deal with GCS and this results in core dumps if probes are inserted
at control flow locations. Fix-up bl, blr and ret to manipulate the
shadow stack as needed.

While we manipulate and validate the shadow stack correctly, the
hardware provides additional security by only allowing GCS operations
against pages which are marked to support GCS. For writing there is
gcssttr() which enforces this, but there isn't an equivalent for
reading. This means that uprobe users should be aware that probing on
control flow instructions which require reading the shadow stack (ex:
ret) offers lower security guarantees than what is achieved without
the uprobe active.

Signed-off-by: Jeremy Linton <jeremy.linton@xxxxxxx>
---
arch/arm64/kernel/probes/simulate-insn.c | 28 ++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/probes/simulate-insn.c b/arch/arm64/kernel/probes/simulate-insn.c
index 09a0b36122d0..1fc9bb69b1eb 100644
--- a/arch/arm64/kernel/probes/simulate-insn.c
+++ b/arch/arm64/kernel/probes/simulate-insn.c
@@ -13,6 +13,7 @@
#include <asm/traps.h>
#include "simulate-insn.h"
+#include "asm/gcs.h"
#define bbl_displacement(insn) \
sign_extend32(((insn) & 0x3ffffff) << 2, 27)
@@ -49,6 +50,18 @@ static inline u32 get_w_reg(struct pt_regs *regs, int reg)
return lower_32_bits(pt_regs_read_reg(regs, reg));
}
+static inline void update_lr(struct pt_regs *regs, long addr)
+{
+ int err = 0;
+
+ if (user_mode(regs) && task_gcs_el0_enabled(current)) {
+ push_user_gcs(addr + 4, &err);
+ if (err)
+ force_sig(SIGSEGV);
+ }
+ procedure_link_pointer_set(regs, addr + 4);
+}
+
static bool __kprobes check_cbz(u32 opcode, struct pt_regs *regs)
{
int xn = opcode & 0x1f;
@@ -107,9 +120,8 @@ simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs)
{
int disp = bbl_displacement(opcode);
- /* Link register is x30 */
if (opcode & (1 << 31))
- set_x_reg(regs, 30, addr + 4);
+ update_lr(regs, addr);
instruction_pointer_set(regs, addr + disp);
}
@@ -133,17 +145,25 @@ simulate_br_blr(u32 opcode, long addr, struct pt_regs *regs)
/* update pc first in case we're doing a "blr lr" */
instruction_pointer_set(regs, get_x_reg(regs, xn));
- /* Link register is x30 */
if (((opcode >> 21) & 0x3) == 1)
- set_x_reg(regs, 30, addr + 4);
+ update_lr(regs, addr);
}
void __kprobes
simulate_ret(u32 opcode, long addr, struct pt_regs *regs)
{
+ u64 ret_addr;
+ int err = 0;
int xn = (opcode >> 5) & 0x1f;
instruction_pointer_set(regs, get_x_reg(regs, xn));
+
+ if (user_mode(regs) && task_gcs_el0_enabled(current)) {
+ ret_addr = pop_user_gcs(&err);
+ if (err || ret_addr != procedure_link_pointer(regs))
+ force_sig(SIGSEGV);
+ }
+
}

I was looking at this earlier this week and decided that this routine should have a return added following the force_sig(), and moving the instruction_pointer_set() below that. That syncs the behavior with what would happen without the uprobe active when ret_addr != LR. Then I dumped the 'xn' too.