Re: [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction
From: Russell King (Oracle)
Date: Mon Apr 22 2024 - 07:15:01 EST
On Fri, Apr 19, 2024 at 06:28:32PM +0000, Puranjay Mohan wrote:
> The current implementation of the mov instruction with sign extension
> has the following problems:
>
> 1. It clobbers the source register if it is not stacked because it
> sign extends the source and then moves it to the destination.
> 2. If the dst_reg is stacked, the current code doesn't write the value
> back in case of 64-bit mov.
> 3. There is room for improvement by emitting fewer instructions.
>
> The steps for fixing this and the instructions emitted by the JIT are
> explained below with examples in all combinations:
>
> Case A: offset == 32:
> =====================
> Case A.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Store tmp_lo into dst_lo
> 3. Sign extend tmp_lo into tmp_hi
> 4. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r3
> r3 is a stacked register
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> // str to dst_lo is not emitted because src_lo == dst_lo
> asr r7, r6, #31 // Sign extend tmp_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi into r3_hi
>
> Case A.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo into dst_hi
>
> Example: r6 = (s32)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo into r6_lo
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case A.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Store src_lo into dst_lo
> 2. Sign extend src_lo into tmp_hi
> 3. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r6
> r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
>
> str r4, [r11, #-16] // Store r6_lo to r3_lo
> asr r7, r4, #31 // Sign extend r6_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to dest_hi
>
> Case A.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Mov src_lo into dst_lo
> 2. Sign extend src_lo into dst_hi
>
> Example: (bf) r6 = (s32)r6
> r6 maps to {ARM_R5, ARM_R4}
>
> // Mov not emitted because dst == src
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B: offset != 32:
> =====================
> Case B.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Sign extend tmp_lo according to offset.
> 3. Store tmp_lo into dst_lo
> 4. Sign extend tmp_lo into tmp_hi
> 5. Store tmp_hi to dst_hi
>
> Example: r9 = (s8)r3
> r9 and r3 are both stacked registers
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> lsl r6, r6, #24 // Sign extend tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-56] // Store tmp_lo to r9_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-52] // Store tmp_hi to r9_hi
>
> Case B.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo according to offset.
> 3. Sign extend tmp_lo into dst_hi
>
> Example: r6 = (s8)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo to r6_lo
> lsl r4, r4, #24 // Sign extend r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Sign extend src_lo into tmp_lo according to offset.
> 2. Store tmp_lo into dst_lo.
> 3. Sign extend src_lo into tmp_hi.
> 4. Store tmp_hi to dst_hi.
>
> Example: r3 = (s8)r1
> r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r6, r2, #24 // Sign extend r1_lo to tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-16] // Store tmp_lo to r3_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to r3_hi
>
> Case B.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Sign extend src_lo into dst_lo according to offset.
> 2. Sign extend dst_lo into dst_hi.
>
> Example: r6 = (s8)r1
> r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r4, r2, #24 // Sign extend r1_lo to r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo to r6_hi
>
> Fixes: fc832653fa0d ("arm32, bpf: add support for sign-extension mov instruction")
> Reported-by: syzbot+186522670e6722692d86@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@xxxxxxxxxx/
> Signed-off-by: Puranjay Mohan <puranjay@xxxxxxxxxx>
This looks really great, thanks for putting in the extra effort!
Reviewed-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
Thanks!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!