* Hitoshi Mitake <mitake@xxxxxxxxxxxxxxxxxxxxx> wrote:
This patch adds a --prefault option to perf bench mem memcpy.
If the user specifies this option, the overhead of page faults
is excluded from the memcpy() score.
Example of usage:
| % ./perf bench mem memcpy -l 500MB
| # Running mem/memcpy benchmark...
| # Copying 500MB Bytes from 0x7fc036749010 to 0x7fc055b4a010 ...
|
| 628.526821 MB/Sec
| mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --prefault
| # Running mem/memcpy benchmark...
| # Copying 500MB Bytes from 0x7ff1b45e2010 to 0x7ff1d39e3010 ...
|
| 4.849256 GB/Sec
Ok, looks rather useful.
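For anyone following along, here is a rough sketch of why the two numbers above differ so much. It is not the code from the patch: timed_memcpy() is a hypothetical helper, and it assumes that "prefault" simply means touching both buffers once before the timed copy, so that the page faults land outside the measured region.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Keeps the compiler from discarding the copies as dead stores. */
static volatile char sink;

/*
 * Hypothetical helper, not the perf bench implementation.
 * Returns the CPU seconds spent in one timed memcpy() of len bytes,
 * or -1.0 if len is zero or an allocation fails.
 */
double timed_memcpy(size_t len, bool prefault)
{
	double elapsed = -1.0;
	char *src, *dst;
	clock_t start, end;

	if (len == 0)
		return elapsed;

	src = calloc(1, len);
	dst = malloc(len);

	if (src && dst) {
		/* Assumed meaning of --prefault: one untimed pass that
		 * faults the pages of both buffers in. */
		if (prefault)
			memcpy(dst, src, len);

		start = clock();
		memcpy(dst, src, len);	/* the copy that is measured */
		end = clock();

		sink = dst[len - 1];
		elapsed = (double)(end - start) / CLOCKS_PER_SEC;
	}

	free(src);
	free(dst);
	return elapsed;
}
```

Calling timed_memcpy(len, false) and timed_memcpy(len, true) with a large len should show the same kind of gap as the output above, though the exact ratio depends on the machine and the allocator.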
We are rather close to being able to apply these bits. We need a resolution for the
arch/x86/lib/memcpy_64.S details. The ugliest are these kinds of #ifdefs:
+#ifndef PERF_BENCH
.Lmemcpy_e:
.previous
+#endif
What happens if we keep that label in place?
This:
+#ifndef PERF_BENCH
ENTRY(__memcpy)
ENTRY(memcpy)
CFI_STARTPROC
+#else
+ .globl memcpy_x86_64_unrolled
+memcpy_x86_64_unrolled:
+#endif
Could be removed if you defined an ENTRY() macro in perf, right?
This:
+#ifndef PERF_BENCH
+
CFI_ENDPROC
ENDPROC(memcpy)
ENDPROC(__memcpy)
Could be solved by defining ENDPROC()/etc. macros in perf, right?
We could remove this #ifdef:
+#ifndef PERF_BENCH
+
#include <linux/linkage.h>
#include <asm/cpufeature.h>
#include <asm/dwarf2.h>
+#endif /* PERF_BENCH */
if you added empty linkage.h, cpufeature.h and dwarf2.h files under
tools/perf/util/include/, e.g. tools/perf/util/include/linux/linkage.h and
tools/perf/util/include/asm/cpufeature.h.
That linkage.h file could even contain a short perf version of the ENTRY() macro,
etc.
That way we can avoid having to touch arch/x86/lib/memcpy_64.S altogether.
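Something along these lines might do, as an untested sketch. The include-guard names are made up for illustration, and the empty CFI_* definitions assume the perf build does not need unwind annotations for this function.

```c
/* Sketch of tools/perf/util/include/linux/linkage.h */
#ifndef PERF_LINUX_LINKAGE_H_		/* guard name is an assumption */
#define PERF_LINUX_LINKAGE_H_

/* Minimal perf-side ENTRY(): export the symbol and emit its label. */
#define ENTRY(name)	\
	.globl name;	\
	name:

/* Nothing to emit on the perf side, so ENDPROC() expands to nothing. */
#define ENDPROC(name)

#endif /* PERF_LINUX_LINKAGE_H_ */

/* Sketch of tools/perf/util/include/asm/dwarf2.h */
#ifndef PERF_ASM_DWARF2_H_		/* guard name is an assumption */
#define PERF_ASM_DWARF2_H_

/* A completely empty file would leave these unexpanded in the .S file,
 * so define them to nothing instead. */
#define CFI_STARTPROC
#define CFI_ENDPROC

#endif /* PERF_ASM_DWARF2_H_ */
```

With stubs like these on perf's include path, memcpy_64.S could be assembled as-is, and the .Lmemcpy_e question from above is the only piece left to check.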