Re: [RFC PATCH 1/1] Revert "rseq/selftests: arm: use udf instruction for RSEQ_SIG"

From: Will Deacon
Date: Mon Jun 24 2019 - 13:24:40 EST


On Mon, Jun 17, 2019 at 05:23:04PM +0200, Mathieu Desnoyers wrote:
> This reverts commit 2b845d4b4acd9422bbb668989db8dc36dfc8f438.
>
> That commit introduces build issues for programs compiled in Thumb mode.
> Rather than try to be clever and emit a valid trap instruction on arm32,
> which requires special care about big/little endian handling on that
> architecture, just emit plain data. Data in the instruction stream is
> technically expected on arm32: this is how literal pools are
> implemented. Reverting to the prior behavior does exactly that.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CC: Joel Fernandes <joelaf@xxxxxxxxxx>
> CC: Catalin Marinas <catalin.marinas@xxxxxxx>
> CC: Dave Watson <davejwatson@xxxxxx>
> CC: Will Deacon <will.deacon@xxxxxxx>
> CC: Shuah Khan <shuah@xxxxxxxxxx>
> CC: Andi Kleen <andi@xxxxxxxxxxxxxx>
> CC: linux-kselftest@xxxxxxxxxxxxxxx
> CC: "H . Peter Anvin" <hpa@xxxxxxxxx>
> CC: Chris Lameter <cl@xxxxxxxxx>
> CC: Russell King <linux@xxxxxxxxxxxxxxxx>
> CC: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
> CC: "Paul E . McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> CC: Paul Turner <pjt@xxxxxxxxxx>
> CC: Boqun Feng <boqun.feng@xxxxxxxxx>
> CC: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> CC: Steven Rostedt <rostedt@xxxxxxxxxxx>
> CC: Ben Maurer <bmaurer@xxxxxx>
> CC: linux-api@xxxxxxxxxxxxxxx
> CC: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> CC: Carlos O'Donell <carlos@xxxxxxxxxx>
> CC: Florian Weimer <fweimer@xxxxxxxxxx>
> ---
> tools/testing/selftests/rseq/rseq-arm.h | 52 ++-------------------------------
> 1 file changed, 2 insertions(+), 50 deletions(-)
>
> diff --git a/tools/testing/selftests/rseq/rseq-arm.h b/tools/testing/selftests/rseq/rseq-arm.h
> index 84f28f147fb6..5f262c54364f 100644
> --- a/tools/testing/selftests/rseq/rseq-arm.h
> +++ b/tools/testing/selftests/rseq/rseq-arm.h
> @@ -5,54 +5,7 @@
> * (C) Copyright 2016-2018 - Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> */
>
> -/*
> - * RSEQ_SIG uses the udf A32 instruction with an uncommon immediate operand
> - * value 0x5de3. This traps if user-space reaches this instruction by mistake,
> - * and the uncommon operand ensures the kernel does not move the instruction
> - * pointer to attacker-controlled code on rseq abort.
> - *
> - * The instruction pattern in the A32 instruction set is:
> - *
> - * e7f5def3 udf #24035 ; 0x5de3
> - *
> - * This translates to the following instruction pattern in the T16 instruction
> - * set:
> - *
> - * little endian:
> - * def3 udf #243 ; 0xf3
> - * e7f5 b.n <7f5>
> - *
> - * pre-ARMv6 big endian code:
> - * e7f5 b.n <7f5>
> - * def3 udf #243 ; 0xf3
> - *
> - * ARMv6+ -mbig-endian generates mixed endianness code vs data: little-endian
> - * code and big-endian data. Ensure the RSEQ_SIG data signature matches code
> - * endianness. Prior to ARMv6, -mbig-endian generates big-endian code and data
> - * (which match), so there is no need to reverse the endianness of the data
> - * representation of the signature. However, the choice between BE32 and BE8
> - * is done by the linker, so we cannot know whether code and data endianness
> - * will be mixed before the linker is invoked.
> - */
> -
> -#define RSEQ_SIG_CODE 0xe7f5def3
> -
> -#ifndef __ASSEMBLER__
> -
> -#define RSEQ_SIG_DATA \
> - ({ \
> - int sig; \
> - asm volatile ("b 2f\n\t" \
> - "1: .inst " __rseq_str(RSEQ_SIG_CODE) "\n\t" \
> - "2:\n\t" \
> - "ldr %[sig], 1b\n\t" \
> - : [sig] "=r" (sig)); \
> - sig; \
> - })
> -
> -#define RSEQ_SIG RSEQ_SIG_DATA
> -
> -#endif
> +#define RSEQ_SIG 0x53053053

I don't get why you're reverting back to this old signature value, when the
one we came up with will work well when interpreted as an instruction in the
*vast* majority of scenarios that people care about (A32/T32 little-endian).
I think you might be under-estimating just how dead things like BE32 really
are.
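
For reference, the halfword split described in the comment being removed can be checked with a few lines of C. This is only a sketch: sig_halfwords() is a hypothetical helper, not something in the selftests, and it assumes code and data share one byte order (little-endian, or pre-ARMv6 BE32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the rseq selftests): given a 32-bit
 * word stored in memory with the stated byte order, report the two
 * 16-bit units a Thumb decoder would fetch, in fetch order. Assumes
 * code and data use the same byte order.
 */
static void sig_halfwords(uint32_t word, int big_endian, uint16_t out[2])
{
	uint16_t hi = (uint16_t)(word >> 16);
	uint16_t lo = (uint16_t)(word & 0xffff);

	assert(out != NULL);

	/* Little-endian stores the low halfword at the lower address. */
	out[0] = big_endian ? hi : lo;
	out[1] = big_endian ? lo : hi;
}
```

With 0xe7f5def3 this gives def3 then e7f5 for little-endian, and e7f5 then def3 for BE32, matching the listing in the removed comment.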

That said, when you ran into .inst.n/.inst.w issues, did you try something
along the lines of the WASM() macro we use in arch/arm/, which adds the ".w"
suffix when targeting Thumb?
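
A rough userspace sketch of that idea, for illustration only: RSEQ_WASM, RSEQ_STR and the two helper functions below are hypothetical names, keying off the compiler's __thumb__ predefine is an assumption about how a selftest header would detect Thumb, and none of this has been built for arm.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical analogue of the kernel's WASM() macro: append ".w" when
 * the compiler targets Thumb, so the assembler picks the 32-bit wide
 * encoding; leave the mnemonic untouched for A32.
 */
#ifdef __thumb__
#define RSEQ_WASM(instr)	#instr ".w"
#else
#define RSEQ_WASM(instr)	#instr
#endif

/* Two-level stringification so RSEQ_SIG_CODE is expanded first. */
#define RSEQ_STR_1(x)	#x
#define RSEQ_STR(x)	RSEQ_STR_1(x)

#define RSEQ_SIG_CODE	0xe7f5def3

/* The text that would be pasted into an asm() statement. */
static const char *rseq_sig_directive(void)
{
	return RSEQ_WASM(.inst) " " RSEQ_STR(RSEQ_SIG_CODE);
}

/* Sanity check: the directive always starts with ".inst". */
static void rseq_sig_selfcheck(void)
{
	assert(strncmp(rseq_sig_directive(), ".inst", 5) == 0);
}
```

The string comes out as ".inst.w 0xe7f5def3" for a Thumb build and ".inst 0xe7f5def3" otherwise, so the same asm template could be shared between the two instruction sets.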

Will