[PATCH v3 1/3] perf sched stats: Fix SIGCHLD vs pause() race in schedstat_record()
From: Swapnil Sapkal
Date: Wed Apr 22 2026 - 01:08:12 EST

If the profiled workload exits very quickly, SIGCHLD can be delivered
and consumed by the empty signal handler before perf enters pause().
pause() then waits for a signal that has already arrived, so perf
hangs indefinitely.

Fix this by replacing the single pause() with two waits that cannot
miss the event:

- When a workload is given, use waitpid() to wait for the child
  directly. This is race-free because waitpid() reaps the child
  whether it exited before or after the call.

- In system-wide mode (no workload), use 'while (!done) sleep(1)'
  to wait for SIGINT/SIGTERM. The signal handler now sets a
  'volatile sig_atomic_t done' flag, and sleep() returns early when
  a signal is delivered, so the flag is rechecked promptly. If the
  signal lands between the check and the sleep() call, the loop
  still exits within one second.
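
For illustration only, the two wait strategies described above can be
sketched as a standalone program fragment. This is not the perf code:
on_signal(), run_and_reap() and wait_for_flag() are hypothetical names,
and the EINTR retry around waitpid() is an assumption of this sketch
rather than something the patch does. Both functions deliberately let
the event happen before (or right as) the wait begins, which is the
case where pause() would block forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Written from the handler, read from the main flow. */
static volatile sig_atomic_t done;

/* Hypothetical handler: only records that a signal arrived. */
static void on_signal(int sig)
{
	(void)sig;
	done = 1;
}

/*
 * Workload mode: the child exits immediately, possibly before the
 * parent reaches waitpid(). waitpid() still reaps it, so the order
 * does not matter. Returns the child's exit code, or -1 on error.
 */
static int run_and_reap(int exit_code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(exit_code);

	/* Assumption of this sketch: retry if a handler interrupts us. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * System-wide mode: the signal is raised before the wait loop starts.
 * Because the handler leaves a persistent flag behind, the loop sees
 * it and never sleeps. Returns the number of sleep() calls made, or
 * -1 on error.
 */
static int wait_for_flag(void)
{
	int sleeps = 0;
	void (*old)(int) = signal(SIGTERM, on_signal);

	if (old == SIG_ERR)
		return -1;

	done = 0;
	raise(SIGTERM);		/* handler runs before raise() returns */

	while (!done) {
		sleep(1);
		sleeps++;
	}

	signal(SIGTERM, old);	/* restore the previous disposition */
	return sleeps;
}
```

Calling run_and_reap(7) should return 7 however quickly the child
exits, and wait_for_flag() should return 0 because the flag is already
set when the loop is first evaluated.
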
Suggested-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Assisted-by: Claude:claude-opus-4.6
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@xxxxxxx>
---
tools/perf/builtin-sched.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 3f509cfdd58c..cfd93bf11c2e 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -36,6 +36,7 @@
#include <linux/zalloc.h>
#include <sys/prctl.h>
#include <sys/resource.h>
+#include <sys/wait.h>
#include <inttypes.h>
#include <errno.h>
@@ -3757,8 +3758,11 @@ static int process_synthesized_schedstat_event(const struct perf_tool *tool,
return 0;
}
+static volatile sig_atomic_t done;
+
static void sighandler(int sig __maybe_unused)
{
+ done = 1;
}
static int enable_sched_schedstats(int *reset)
@@ -3899,11 +3903,15 @@ static int perf_sched__schedstat_record(struct perf_sched *sched,
if (err < 0)
goto out;
- if (argc)
- evlist__start_workload(evlist);
+ done = 0;
- /* wait for signal */
- pause();
+ if (argc) {
+ evlist__start_workload(evlist);
+ waitpid(evlist->workload.pid, NULL, 0);
+ } else {
+ while (!done)
+ sleep(1);
+ }
if (reset) {
err = disable_sched_schedstat();
--
2.43.0