On Sat, 11 Sep 2021 21:50:43 +0800
Yinan Liu <yinan@xxxxxxxxxxxxxxxxx> wrote:

When ftrace is enabled, ftrace_init will consume a period of
time, usually around 15~20ms. Approximately 60% of the time is
consumed by nop-processing. Moving the nop-processing to the
compile time can speed up the kernel boot process.

performance test:
        env:    Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz
        method: before and after patching, compare the
                total time of ftrace_init(), and verify
                the functionality of ftrace.

avg_time of ftrace_init:
        with patch: 7.114ms
        without patch: 15.763ms

What compiler are you using? Because by default, gcc should already do
this for you. In fact, recordmcount isn't even called with the latest
gcc, as gcc creates mcount_loc and inserts nops.

This was implemented before, but we used to have "ideal nops" that
were determined at run time: different CPUs differed in how efficiently
they executed each nop, so we had to do it at run time.

But that is no longer the case today, so we can revisit this.

Signed-off-by: Yinan Liu <yinan@xxxxxxxxxxxxxxxxx>
---
 kernel/trace/ftrace.c  |  4 ++++
 scripts/recordmcount.h | 14 ++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index c236da868990..ae3fba331179 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6261,6 +6261,10 @@ static int ftrace_process_locs(struct module *mod,
 	 * until we are finished with it, and there's no
 	 * reason to cause large interrupt latencies while we do it.
 	 */
+#if defined CONFIG_X86 || defined CONFIG_X86_64 || defined CONFIG_ARM || defined CONFIG_ARM64

We don't list archs in generic files. The above needs to be something like:

#ifdef ARCH_HAS_MCOUNT_NOP

or some name like that, and then that macro gets defined by the arch
header (include/asm/ftrace.h)

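As a rough illustration of that split (a minimal sketch, not kernel code: ARCH_HAS_MCOUNT_NOP is just the placeholder name from above, and process_locs_sketch() and update_code_sketch() are made-up stand-ins for the real functions), the generic code tests a single macro and each arch decides whether to define it:

```c
/*
 * Hypothetical: in the kernel this define would come from the arch's
 * asm/ftrace.h. ARCH_HAS_MCOUNT_NOP is only the placeholder name
 * suggested above, not an existing kernel symbol.
 */
#define ARCH_HAS_MCOUNT_NOP 1

/* Counts how often the stand-in run-time nop pass was executed. */
static int runtime_nop_passes;

/* Stand-in for the run-time nop patching done by ftrace_update_code(). */
static void update_code_sketch(void)
{
	runtime_nop_passes++;
}

/* Simplified model of the tail of ftrace_process_locs(); returns 0 on success. */
static int process_locs_sketch(void)
{
	int ret = 0;

#ifdef ARCH_HAS_MCOUNT_NOP
	/* The build already wrote the nops, so skip the run-time pass. */
	goto out;
#endif

	update_code_sketch();
out:
	return ret;
}
```

With the define present, process_locs_sketch() returns 0 without calling update_code_sketch(). An arch that does not define the macro falls through to the run-time pass, and the generic file never has to name an arch.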
+	ret = 0;
+	goto out;
+#endif

space should be here.

 	if (!mod)
 		local_irq_save(flags);
 	ftrace_update_code(mod, start_pg);

-- Steve