Re: [PATCH 3/3] perf record: mmap output file

From: Mike Galbraith
Date: Wed Oct 09 2013 - 03:14:22 EST


On Tue, 2013-10-08 at 21:26 -0600, David Ahern wrote:
> When recording raw_syscalls for the entire system, e.g.,
> perf record -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
>
> you end up with a negative feedback loop as perf itself calls
> write() fairly often. This patch handles the problem by mmap'ing the
> file in chunks of 64M at a time and copies events from the event buffers
> to the file avoiding write system calls.
>
> Before (with write syscall):
>
> perf record -o /tmp/perf.data -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 81.843 MB /tmp/perf.data (~3575786 samples) ]
>
> After (using mmap):
>
> perf record -o /tmp/perf.data -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
> [ perf record: Woken up 31 times to write data ]
> [ perf record: Captured and wrote 8.203 MB /tmp/perf.data (~358388 samples) ]

You cured some annoying lost chunks too.

master (cache is hot in target tip tree)

[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1398.520 MB perf.data (~61102281 samples) ]
Warning:
Processed 15894046 events and lost 15 chunks!

Check IO/CPU overload!


Performance counter stats for 'sh -c cd /usr/local/src/kernel/linux-3.x-tip/tools/perf; make clean; perf record -a -e raw_syscalls:* -- make -j4':

90405.531301 task-clock # 2.580 CPUs utilized
18433 context-switches # 0.204 K/sec
1503 cpu-migrations # 0.017 K/sec
2543758 page-faults # 0.028 M/sec
214649052820 cycles # 2.374 GHz [50.40%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
201026412643 instructions # 0.94 insns per cycle [75.48%]
41960433084 branches # 464.136 M/sec [75.22%]
1835477524 branch-misses # 4.37% of all branches [75.13%]

35.039394045 seconds time elapsed


patched master
marge:/usr/local/src/kernel/linux-3.x.git/tools/perf # export PERF=`pwd`/perf

[ perf record: Woken up 1070 times to write data ]
[ perf record: Captured and wrote 269.060 MB perf.data (~11755404 samples) ]

Performance counter stats for 'sh -c cd /usr/local/src/kernel/linux-3.x-tip/tools/perf; make clean; $PERF record -a -e raw_syscalls:* -- make -j4':

67553.069792 task-clock # 2.934 CPUs utilized
16183 context-switches # 0.240 K/sec
880 cpu-migrations # 0.013 K/sec
2253888 page-faults # 0.033 M/sec
159940676657 cycles # 2.368 GHz [50.20%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
152420909413 instructions # 0.95 insns per cycle [75.39%]
32857102452 branches # 486.389 M/sec [75.58%]
1674403244 branch-misses # 5.10% of all branches [75.15%]

23.020412569 seconds time elapsed

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/