[PATCH] tools/perf: Remove broken __no_tail_call attribute

From: Peter Zijlstra
Date: Wed Oct 28 2020 - 17:52:46 EST


On Tue, Oct 27, 2020 at 04:11:27PM -0700, Nick Desaulniers wrote:
> On Tue, Oct 27, 2020 at 4:04 PM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
> >
> > On 10/27/20 9:57 PM, Ard Biesheuvel wrote:
> > > Commit 3193c0836f203 ("bpf: Disable GCC -fgcse optimization for
> > > ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
> > > function scope __attribute__((optimize("-fno-gcse"))), to disable a
> > > GCC specific optimization that was causing trouble on x86 builds, and
> > > was not expected to have any positive effect in the first place.
> > >
> > > However, as the GCC manual documents, __attribute__((optimize))
> > > is not for production use, and results in all other optimization
> > > options to be forgotten for the function in question. This can
> > > cause all kinds of trouble, but in one particular reported case,
> >
> > Looks like there are couple more as well aside from __no_fgcse, are you
> > also planning to fix them?
> >
> > arch/powerpc/kernel/setup.h:14:#define __nostackprotector __attribute__((__optimize__("no-stack-protector")))
>
> GCC literally just landed support for
> __attribute__((no_stack_protector)) a few days ago. I was planning on
> sending a patch adding it to compiler_attributes.h, but we won't be
> able to rely on it for a while. Now I see I'll have to clean up ppc a
> bit. Surely they've had bugs related to optimize attribute
> unexpectedly dropping flags.
>
> > tools/include/linux/compiler-gcc.h:37:#define __no_tail_call __attribute__((optimize("no-optimize-sibling-calls")))
>
> Only used in perf?
> tools/perf/tests/dwarf-unwind.c

Right, that should probably be fixed. It also probably doesn't matter
too much since its an unwinder tests, but still, having that attribute
is dangerous.

The only cross-compiler way of doing this is like in commit
a9a3ed1eff360.

---
Subject: tools/perf: Remove broken __no_tail_call attribute

The GCC specific __attribute__((optimize)) attribute does not what is
commonly expected and is explicitly recommended against using in
production code by the GCC people.

Unlike what is often expected, it doesn't add to the optimization flags,
but it fully replaces them, loosing any and all optimization flags
provided by the compiler commandline.

The only guaranteed upon means of inhibiting tail-calls is by placing a
volatile asm with side-effects after the call such that the tail-call
simply cannot be done.

Given the original commit wasn't specific on which calls were the
problem, this removal might re-introduce the problem, which can then be
re-analyzed and cured properly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
tools/include/linux/compiler-gcc.h | 12 ------------
tools/include/linux/compiler.h | 3 ---
tools/perf/tests/dwarf-unwind.c | 10 +++++-----
3 files changed, 5 insertions(+), 20 deletions(-)

diff --git a/tools/include/linux/compiler-gcc.h b/tools/include/linux/compiler-gcc.h
index b9d4322e1e65..95c072b70d0e 100644
--- a/tools/include/linux/compiler-gcc.h
+++ b/tools/include/linux/compiler-gcc.h
@@ -27,18 +27,6 @@
#define __pure __attribute__((pure))
#endif
#define noinline __attribute__((noinline))
-#ifdef __has_attribute
-#if __has_attribute(disable_tail_calls)
-#define __no_tail_call __attribute__((disable_tail_calls))
-#endif
-#endif
-#ifndef __no_tail_call
-#if GCC_VERSION > 40201
-#define __no_tail_call __attribute__((optimize("no-optimize-sibling-calls")))
-#else
-#define __no_tail_call
-#endif
-#endif
#ifndef __packed
#define __packed __attribute__((packed))
#endif
diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index 2b3f7353e891..d22a974372c0 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -47,9 +47,6 @@
#ifndef noinline
#define noinline
#endif
-#ifndef __no_tail_call
-#define __no_tail_call
-#endif

/* Are two types/vars the same type (ignoring qualifiers)? */
#ifndef __same_type
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 2491d167bf76..83638097c3bc 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -95,7 +95,7 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
return strcmp((const char *) symbol, funcs[idx]);
}

-__no_tail_call noinline int test_dwarf_unwind__thread(struct thread *thread)
+noinline int test_dwarf_unwind__thread(struct thread *thread)
{
struct perf_sample sample;
unsigned long cnt = 0;
@@ -126,7 +126,7 @@ __no_tail_call noinline int test_dwarf_unwind__thread(struct thread *thread)

static int global_unwind_retval = -INT_MAX;

-__no_tail_call noinline int test_dwarf_unwind__compare(void *p1, void *p2)
+noinline int test_dwarf_unwind__compare(void *p1, void *p2)
{
/* Any possible value should be 'thread' */
struct thread *thread = *(struct thread **)p1;
@@ -145,7 +145,7 @@ __no_tail_call noinline int test_dwarf_unwind__compare(void *p1, void *p2)
return p1 - p2;
}

-__no_tail_call noinline int test_dwarf_unwind__krava_3(struct thread *thread)
+noinline int test_dwarf_unwind__krava_3(struct thread *thread)
{
struct thread *array[2] = {thread, thread};
void *fp = &bsearch;
@@ -164,12 +164,12 @@ __no_tail_call noinline int test_dwarf_unwind__krava_3(struct thread *thread)
return global_unwind_retval;
}

-__no_tail_call noinline int test_dwarf_unwind__krava_2(struct thread *thread)
+noinline int test_dwarf_unwind__krava_2(struct thread *thread)
{
return test_dwarf_unwind__krava_3(thread);
}

-__no_tail_call noinline int test_dwarf_unwind__krava_1(struct thread *thread)
+noinline int test_dwarf_unwind__krava_1(struct thread *thread)
{
return test_dwarf_unwind__krava_2(thread);
}