[PATCH 09/19] perf sched replay: Fix the segmentation fault problem caused by pr_err in threads

From: Arnaldo Carvalho de Melo
Date: Wed Apr 08 2015 - 10:24:40 EST


From: Yunlong Song <yunlong.song@xxxxxxxxxx>

The pr_err in self_open_counters() prints error message to stderr.
Unlike stdout, stderr uses memory buffer on the stack of each calling
process.

The pr_err in self_open_counters() works in a thread called thread_func
created in function create_tasks, which concurrently creates
sched->nr_tasks threads.

If the error happens and pr_err prints the error message in each of
these threads, the stack size of the perf process (default is 8192
kbytes) will quickly run out and the segmentation fault will happen
then.

To solve this problem, pr_err with self_open_counters() should be moved
from newly created threads to the old main thread of the perf process.
Then the pr_err can work in a stable situation without the strange
segmentation fault problem.

Example:

Test environment: x86_64 with 160 cores

Before this patch:

$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
Segmentation fault

After this patch:

$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
...

As shown above, the result continues without any segmentation fault.

Signed-off-by: Yunlong Song <yunlong.song@xxxxxxxxxx>
Cc: Paul Mackerras <paulus@xxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Wang Nan <wangnan0@xxxxxxxxxx>
Link: http://lkml.kernel.org/r/1427809596-29559-6-git-send-email-yunlong.song@xxxxxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
---
tools/perf/builtin-sched.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index dd714818fa4d..7fe3b3cb4cc8 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -472,6 +472,7 @@ static u64 get_cpu_usage_nsec_self(int fd)
struct sched_thread_parms {
struct task_desc *task;
struct perf_sched *sched;
+ int fd;
};

static void *thread_func(void *ctx)
@@ -482,13 +483,12 @@ static void *thread_func(void *ctx)
u64 cpu_usage_0, cpu_usage_1;
unsigned long i, ret;
char comm2[22];
- int fd;
+ int fd = parms->fd;

zfree(&parms);

sprintf(comm2, ":%s", this_task->comm);
prctl(PR_SET_NAME, comm2);
- fd = self_open_counters();
if (fd < 0)
return NULL;
again:
@@ -540,6 +540,7 @@ static void create_tasks(struct perf_sched *sched)
BUG_ON(parms == NULL);
parms->task = task = sched->tasks[i];
parms->sched = sched;
+ parms->fd = self_open_counters();
sem_init(&task->sleep_sem, 0, 0);
sem_init(&task->ready_for_work, 0, 0);
sem_init(&task->work_done_sem, 0, 0);
--
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/