Re: [PATCH v2] perf test: Make leafloop workload immune to compiler options
From: Ian Rogers
Date: Mon May 11 2026 - 11:44:27 EST
On Mon, May 11, 2026 at 2:19 AM James Clark <james.clark@xxxxxxxxxx> wrote:
>
> Since the leafloop test program was moved into the main Perf binary as a
> workload, it inherited the same compiler options as Perf. In this case
> the -fstack-protector option broke the assumption that simple leaf
> frames don't have a stack frame on Arm. This causes
> test_arm_callgraph_fp.sh to pass even if the stack isn't augmented with
> the link register, making the test useless.
>
> Fix it by rewriting the leaf function in assembly seeing as it's so
> simple. Adding -fno-stack-protector would also work, but wouldn't be
> robust against other future compiler option additions.
>
> The local variables and 'a' variable were never needed so remove them to
> simplify.
>
> Assisted-by: GitHub-Copilot:GPT-5.5
> Signed-off-by: James Clark <james.clark@xxxxxxxxxx>
> ---
> Changes in v2:
> - Push and pop asm sections - (Sashiko)
> - Add .size directive - (Sashiko)
> - Add asm label for done and test with LTO enabled - (Sashiko)
> - Link to v1: https://lore.kernel.org/r/20260508-james-perf-leafloop-stack-v1-1-637c260b2da8@xxxxxxxxxx
> ---
> tools/perf/tests/workloads/leafloop.c | 40 +++++++++++++++++++++++++++--------
> 1 file changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/tests/workloads/leafloop.c b/tools/perf/tests/workloads/leafloop.c
> index f7561767e32c..c20c75f7ba49 100644
> --- a/tools/perf/tests/workloads/leafloop.c
> +++ b/tools/perf/tests/workloads/leafloop.c
> @@ -6,26 +6,48 @@
> #include "../tests.h"
>
> /* We want to check these symbols in perf script */
> -noinline void leaf(volatile int b);
> -noinline void parent(volatile int b);
> +noinline void leaf(void);
> +noinline void parent(void);
>
> -static volatile int a;
> -static volatile sig_atomic_t done;
> +static volatile sig_atomic_t done asm("leafloop_done");
>
> static void sighandler(int sig __maybe_unused)
> {
> 	done = 1;
> }
>
> -noinline void leaf(volatile int b)
> +#if defined(__aarch64__)
> +/*
> + * Write leaf() in assembly so it stays as a minimal leaf function with no
> + * stack frame and won't get silently broken in the future by any Perf wide
> + * compilation options like -fstack-protector-all.
> + */
> +asm(
> +	".pushsection .text,\"ax\",%progbits\n"
> +	".global leaf\n"
> +	".type leaf, %function\n"
> +	"leaf:\n"
> +	" adrp x1, leafloop_done\n"
> +	" ldr w2, [x1, #:lo12:leafloop_done]\n"
> +	" cbz w2, leaf\n"
> +	" ret\n"
> +	".size leaf, .-leaf\n"
> +	".popsection\n"
> +);
On first reading I wondered why this couldn't just be an asm block
inside a C function, but I get it: you want a specific function
entry/exit so that the test exercises the leaf unwinding feature.
Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
Fwiw, looking at the test the command line is somewhat unusual:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/shell/test_arm_callgraph_fp.sh?h=perf-tools-next#n32
```
perf record -o "$PERF_DATA" --call-graph fp -e cycles//u --user-callchains -- $TEST_PROGRAM
```
Why cycles//u rather than cycles:u, and why the extra --user-callchains?
Thanks,
Ian
> +
> +#else
> +
> +noinline void leaf(void)
> {
> 	while (!done)
> -		a += b;
> +		;
> }
>
> -noinline void parent(volatile int b)
> +#endif
> +
> +noinline void parent(void)
> {
> -	leaf(b);
> +	leaf();
> }
>
> static int leafloop(int argc, const char **argv)
> @@ -39,7 +61,7 @@ static int leafloop(int argc, const char **argv)
> 	signal(SIGALRM, sighandler);
> 	alarm(sec);
>
> -	parent(sec);
> +	parent();
> 	return 0;
> }
>
>
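
In case it helps anyone who wants to poke at the portable fallback outside
of perf, the flag-and-alarm pattern used above can be sketched standalone
roughly like this. spin_until_alarm() is a made-up name, and the assert on
sec is my addition, since alarm(0) cancels any pending alarm and would
leave the loop spinning forever:

```c
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Written by the signal handler, polled by the busy loop. */
static volatile sig_atomic_t done;

static void sighandler(int sig)
{
	(void)sig;
	done = 1;
}

/*
 * Hypothetical helper, not in the patch: spin until SIGALRM fires after
 * 'sec' seconds, then return the final value of the flag.
 */
int spin_until_alarm(unsigned int sec)
{
	/* alarm(0) would never deliver SIGALRM, so refuse it. */
	assert(sec > 0);

	done = 0;
	signal(SIGALRM, sighandler);
	alarm(sec);

	/* volatile forces a fresh load of 'done' on every iteration. */
	while (!done)
		;

	return done;
}
```

Calling spin_until_alarm(1) busy-loops for about a second and then
returns once the handler has set the flag.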
> ---
> base-commit: 8c8f2093614373ea8179b562320212a25cf937c0
> change-id: 20260508-james-perf-leafloop-stack-c221600eddf2
>
> Best regards,
> --
> James Clark <james.clark@xxxxxxxxxx>
>