The following commit:
b52956c perf tools: Allow multiple threads or processes in record, stat, top
introduced a bug in the thread_map code that prevented
perf record -a from setting up system-wide monitoring properly.
$ taskset -c 1 noploop 1000&
$ perf record -a -C 1 sleep 10
$ perf report -D | tail -20
cycles stats:
TOTAL events: 4413
MMAP events: 4025
COMM events: 340
SAMPLE events: 48
I was expecting about 10,000 samples here, not 48.
In system-wide mode, the PID passed to perf_event_open()
must be -1, but it was 0. That caused the kernel to set up
a per-process event on PID:0 instead of a per-CPU event.
Consequently, the number of samples captured does not
correspond to the requested measurement.
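
For illustration, here is a minimal sketch of how the pid argument
selects between the two modes. The helper names event_pid() and
open_cycles_on_cpu() are hypothetical and are not taken from the
perf sources; only perf_event_open() itself and the attr fields
are real. Actually opening a system-wide event normally requires
sufficient privileges (see perf_event_paranoid).

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open(), so go through syscall(2). */
static long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
				int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Hypothetical helper: choose the pid argument.
 * pid == -1 with cpu >= 0 means "every task running on that CPU".
 * pid ==  0 means "the calling task", which is what the bug produced.
 */
static pid_t event_pid(int system_wide, pid_t target)
{
	return system_wide ? -1 : target;
}

/* Hypothetical helper: open a disabled cycles counter on one CPU. */
static long open_cycles_on_cpu(int cpu, int system_wide, pid_t target)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;

	/* group_fd = -1: no event group; flags = 0: defaults */
	return sys_perf_event_open(&attr, event_pid(system_wide, target),
				   cpu, -1, 0);
}
```

With this sketch, open_cycles_on_cpu(1, 1, 0) would request the
same kind of event that perf record -a -C 1 is expected to open.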
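The following one-liner fixes the problem for me with or
without -C. The old code stored -1 in map[1], which is past
the end of the one-entry allocation, and left map[0]
uninitialized.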
I would also suggest changing the malloc() size to something
that matches the struct definition: the allocation uses
sizeof(pid_t), but thread_map->map[] is declared as int map[]
and not pid_t map[]. If map[] can only contain pids, then
change the struct definition instead.
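
One way to keep the allocation in sync with the struct, whatever
the element type ends up being, is to size it from the member
itself. This is only a sketch: the struct below is a stand-in
that assumes an int nr field followed by int map[], and the
function name thread_map__new_all() is hypothetical.

```c
#include <stdlib.h>

/* Stand-in for perf's struct thread_map; the layout is assumed. */
struct thread_map {
	int nr;
	int map[];
};

/* Hypothetical helper: one-entry map meaning "all threads" (-1). */
static struct thread_map *thread_map__new_all(void)
{
	struct thread_map *threads;

	/* sizeof(threads->map[0]) follows the declared element type. */
	threads = malloc(sizeof(*threads) + sizeof(threads->map[0]));
	if (threads != NULL) {
		threads->map[0] = -1;
		threads->nr = 1;
	}
	return threads;
}
```

If map[] is later changed to pid_t, this allocation stays correct
without further edits.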
Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
---
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index e15983c..84d9bd78 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -229,7 +229,7 @@ static struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
 	if (!tid_str) {
 		threads = malloc(sizeof(*threads) + sizeof(pid_t));
 		if (threads != NULL) {
-			threads->map[1] = -1;
+			threads->map[0] = -1;
 			threads->nr = 1;
 		}
 		return threads;