Re: [PATCH 2/2] ARM: futex: make futex_detect_cmpxchg more reliable
From: Russell King - ARM Linux admin
Date: Fri Mar 08 2019 - 05:57:15 EST
On Fri, Mar 08, 2019 at 11:16:47AM +0100, Ard Biesheuvel wrote:
> Compiling the following code
>
> """
> #include <stdio.h>
>
> static void foo(void *a, int b)
> {
> 	asm("str %0, [%1]" :: "r"(a), "r"(b));
> }
>
> int main(void)
> {
> 	foo(NULL, 0);
> }
> """
>
> with GCC 6.3 (at -O2) gives me
>
> .arch armv7-a
> .eabi_attribute 28, 1
> .eabi_attribute 20, 1
> .eabi_attribute 21, 1
> .eabi_attribute 23, 3
> .eabi_attribute 24, 1
> .eabi_attribute 25, 1
> .eabi_attribute 26, 2
> .eabi_attribute 30, 2
> .eabi_attribute 34, 1
> .eabi_attribute 18, 4
> .file "futex.c"
> .section .text.startup,"ax",%progbits
> .align 1
> .p2align 2,,3
> .global main
> .syntax unified
> .thumb
> .thumb_func
> .fpu vfpv3-d16
> .type main, %function
> main:
> @ args = 0, pretend = 0, frame = 0
> @ frame_needed = 0, uses_anonymous_args = 0
> @ link register save eliminated.
> movs r0, #0
> .syntax unified
> @ 6 "/tmp/futex.c" 1
> str r0, [r0]
> @ 0 "" 2
> .thumb
> .syntax unified
> bx lr
> .size main, .-main
> .ident "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
> .section .note.GNU-stack,"",%progbits
>
> and so GCC definitely behaves similarly in this regard.
Let's take this further - a volatile is required for these cases to
avoid gcc eliminating the asm() due to the output not being used:

	#define NULL ((void *)0)

	static void foo(void *a, int b)
	{
		asm volatile("str %1, [%0]" : "=&r" (a) : "0" (a), "r" (b));
	}

	int main(void)
	{
		foo(NULL, 0);
	}

produces:

	mov	r3, #0
	mov	r2, r3
	str	r2, [r2]
which looks to me to contradict the GCC manual - the '&' on the
output operand should mean that it does not conflict with other input
operands, but clearly 'r2' has ended up holding 'b' as well. I suspect
this is a bug, or if not, it is completely counter-intuitive given the
description in the GCC manual.

Using "+r" (a) : "r" (b) also results in:

	mov	r3, #0
	str	r3, [r3]

It seems that only using "+&r" (a) : "r" (b) avoids a and b being in
the same register, but I question whether we are stepping into
undefined compiler behaviour with that.
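
For reference, a minimal sketch of the "+&r" form mentioned above. The
helper name store_word() is hypothetical and not from this thread, and
the "memory" clobber is an addition that the examples above do not have.
The non-ARM branch exists only so the sketch builds on other
architectures. Whether GCC guarantees distinct registers for this
constraint combination is exactly the open question, so this is an
illustration, not a recommendation.

```c
/*
 * Hypothetical helper (not from the thread): store 'val' at 'addr'.
 */
static void store_word(int *addr, int val)
{
#if defined(__arm__)
	/*
	 * "+&r": addr is a read-write, earlyclobber operand, which is
	 * the form observed above to keep addr and val in different
	 * registers. The "memory" clobber (an assumption added here)
	 * tells the compiler that the asm writes to memory.
	 */
	asm volatile("str %1, [%0]" : "+&r" (addr) : "r" (val) : "memory");
#else
	/* Portable fallback: a plain volatile store. */
	*(volatile int *)addr = val;
#endif
}
```

Called as store_word(&slot, 42) on a local int slot, this should leave
42 in slot. Unlike the test cases above it is handed a valid address,
so it cannot show the NULL/0 register sharing by itself. Comparing the
-O2 assembly of the "+r" and "+&r" variants of foo() is still the way
to check what a given compiler does.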