Re: [PATCH v2] perf test: Make leafloop workload immune to compiler options

From: James Clark

Date: Mon May 11 2026 - 12:42:16 EST




On 11/05/2026 4:38 pm, Ian Rogers wrote:
On Mon, May 11, 2026 at 2:19 AM James Clark <james.clark@xxxxxxxxxx> wrote:

Since the leafloop test program was moved into the main Perf binary as a
workload, it inherited the same compiler options as Perf. In this case
the -fstack-protector option broke the assumption that simple leaf
functions don't have a stack frame on Arm. This causes
test_arm_callgraph_fp.sh to pass even if the stack isn't augmented with
the link register, making the test useless.

Fix it by rewriting the leaf function in assembly seeing as it's so
simple. Adding -fno-stack-protector would also work, but wouldn't be
robust against other future compiler option additions.

The local variables and 'a' variable were never needed so remove them to
simplify.

Assisted-by: GitHub-Copilot:GPT-5.5
Signed-off-by: James Clark <james.clark@xxxxxxxxxx>
---
Changes in v2:
- Push and pop asm sections - (Sashiko)
- Add .size directive - (Sashiko)
- Add asm label for done and test with LTO enabled - (Sashiko)
- Link to v1: https://lore.kernel.org/r/20260508-james-perf-leafloop-stack-v1-1-637c260b2da8@xxxxxxxxxx
---
tools/perf/tests/workloads/leafloop.c | 40 +++++++++++++++++++++++++++--------
1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/tools/perf/tests/workloads/leafloop.c b/tools/perf/tests/workloads/leafloop.c
index f7561767e32c..c20c75f7ba49 100644
--- a/tools/perf/tests/workloads/leafloop.c
+++ b/tools/perf/tests/workloads/leafloop.c
@@ -6,26 +6,48 @@
#include "../tests.h"

/* We want to check these symbols in perf script */
-noinline void leaf(volatile int b);
-noinline void parent(volatile int b);
+noinline void leaf(void);
+noinline void parent(void);

-static volatile int a;
-static volatile sig_atomic_t done;
+static volatile sig_atomic_t done asm("leafloop_done");

static void sighandler(int sig __maybe_unused)
{
done = 1;
}

-noinline void leaf(volatile int b)
+#if defined(__aarch64__)
+/*
+ * Write leaf() in assembly so it stays as a minimal leaf function with no
+ * stack frame and won't get silently broken in the future by any Perf wide
+ * compilation options like -fstack-protector-all.
+ */
+asm(
+ ".pushsection .text,\"ax\",%progbits\n"
+ ".global leaf\n"
+ ".type leaf, %function\n"
+ "leaf:\n"
+ " adrp x1, leafloop_done\n"
+ " ldr w2, [x1, #:lo12:leafloop_done]\n"
+ " cbz w2, leaf\n"
+ " ret\n"
+ ".size leaf, .-leaf\n"
+ ".popsection\n"
+);

On reading this I thought, why can't we just use an asm block in a
function, but I get it, you want specific function entry/exit to test
the leaf unwinding feature.
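
For illustration, here is a minimal standalone sketch of the alternative mentioned above: an asm statement placed inside an ordinary C function. All `sketch_*` names are hypothetical and not part of the patch. The compiler still generates the function's entry and exit code here, so a build-wide option such as -fstack-protector-all may add a stack frame around the loop. The top-level asm in the patch avoids this by defining the whole function itself.

```c
#include <signal.h>
#include <unistd.h>

/* Hypothetical standalone names for illustration; not part of the patch. */
static volatile sig_atomic_t sketch_done;

static void sketch_sighandler(int sig)
{
	(void)sig;
	sketch_done = 1;
}

/*
 * Inline asm inside a C function: the compiler still owns the prologue
 * and epilogue, so compiler options can change the frame layout.
 */
__attribute__((noinline)) void sketch_leaf(void)
{
	while (!sketch_done)
		__asm__ volatile("" ::: "memory"); /* empty asm, compiler barrier only */
}

__attribute__((noinline)) void sketch_parent(void)
{
	sketch_leaf();
}

/* Spin in sketch_leaf() until SIGALRM fires after 'sec' seconds, then return 0. */
int sketch_run(unsigned int sec)
{
	sketch_done = 0;
	signal(SIGALRM, sketch_sighandler);
	alarm(sec);
	sketch_parent();
	return 0;
}
```

Calling `sketch_run(1)` spins for about one second and then returns. Whether `sketch_leaf()` ends up with a stack frame depends on the compiler and its flags, which is the fragility the patch is trying to remove.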

Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>

Fwiw, looking at the test the command line is somewhat unusual:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/shell/test_arm_callgraph_fp.sh?h=perf-tools-next#n32
```
perf record -o "$PERF_DATA" --call-graph fp -e cycles//u \
    --user-callchains -- $TEST_PROGRAM
```
cycles//u rather than cycles:u and why the extra --user-callchains?

Thanks,
Ian


--user-callchains looks like it was added to make the grep in the test easier, so that it didn't have to ignore the kernel part of the callchain. It might be redundant now after a later change I made.

cycles//u has always been there, so there's no explanation for it. I thought that was a valid way to open an event? Is it weird because // is for options in perf record and not stat?

+
+#else
+
+noinline void leaf(void)
{
while (!done)
- a += b;
+ ;
}

-noinline void parent(volatile int b)
+#endif
+
+noinline void parent(void)
{
- leaf(b);
+ leaf();
}

static int leafloop(int argc, const char **argv)
@@ -39,7 +61,7 @@ static int leafloop(int argc, const char **argv)
signal(SIGALRM, sighandler);
alarm(sec);

- parent(sec);
+ parent();
return 0;
}


---
base-commit: 8c8f2093614373ea8179b562320212a25cf937c0
change-id: 20260508-james-perf-leafloop-stack-c221600eddf2

Best regards,
--
James Clark <james.clark@xxxxxxxxxx>