On Wed, Nov 3, 2021 at 9:23 PM Yonghong Song <yhs@xxxxxx> wrote:
asm("") indeed helped preserve the call.
[$ ~/tmp2] cat t.c
int __attribute__((noinline)) foo() { asm(""); return 1; }
int bar() { return foo() + foo(); }
[$ ~/tmp2] clang -O2 -c t.c
[$ ~/tmp2] llvm-objdump -d t.o
t.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <foo>:
0: b8 01 00 00 00 movl $1, %eax
5: c3 retq
6: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
0000000000000010 <bar>:
10: 50 pushq %rax
11: e8 00 00 00 00 callq 0x16 <bar+0x6>
16: e8 00 00 00 00 callq 0x1b <bar+0xb>
1b: b8 02 00 00 00 movl $2, %eax
20: 59 popq %rcx
21: c3 retq
[$ ~/tmp2]
Note that with asm(""), foo() is called twice, but the compiler still
knows that foo()'s return value is 1, so it does the calculation at
compile time, assigns 2 to %eax, and returns.
I missed the %eax=2 part...
That means that asm("") is not enough.
Maybe something like:
int __attribute__((noinline)) foo()
{
	int ret = 0;

	/* Pretend to read and rewrite ret so its value is opaque to the compiler. */
	asm volatile("" : "=r"(ret) : "0"(ret));
	return ret;
}