Re: [PATCH v2 3/6] treewide: Replace memcpy(..., current->comm) with strscpy()
From: Steven Rostedt
Date: Tue May 26 2026 - 19:06:37 EST
On Sun, 24 May 2026 19:38:53 -0300
André Almeida <andrealmeid@xxxxxxxxxx> wrote:
> In order to increase the size of current->comm[] and to avoid breaking any
> existing code, replace memcpy() with strscpy(). The latter function makes
> sure that the copy is NUL terminated. This is crucial given that the
> source buffer might be larger than the destination buffer and could
> truncate the NUL character out of it.
>
> Signed-off-by: André Almeida <andrealmeid@xxxxxxxxxx>
> ---
> Changes from v2:
> - New patch, dropped strtostr() from last version
> ---
> include/linux/coredump.h | 2 +-
> include/linux/tracepoint.h | 4 ++--
> include/trace/events/block.h | 10 +++++-----
> include/trace/events/coredump.h | 2 +-
> include/trace/events/f2fs.h | 4 ++--
> include/trace/events/oom.h | 2 +-
> include/trace/events/osnoise.h | 2 +-
> include/trace/events/sched.h | 10 +++++-----
> include/trace/events/signal.h | 2 +-
> include/trace/events/task.h | 4 ++--
> kernel/printk/nbcon.c | 2 +-
> kernel/printk/printk.c | 2 +-
> 12 files changed, 23 insertions(+), 23 deletions(-)
>
I was curious what impact this would have on tracing, so I decided to
run the following:
  perf stat -r 100 ./hackbench 50
to see how it affects things. Hackbench is a bit of a microbenchmark, but it
stresses the scheduler and thus the scheduler trace events.
I first ran the above and put the output into "stat.baseline", then I enabled
all scheduler trace events:
  trace-cmd start -e sched
and ran it again and put the output into "stat.before".
I applied the patch and ran it again before enabling tracing (just to see
the variance) and put that into "stat.baseline2". I then enabled tracing
and ran it again and put the output into "stat.after".
Here are the results:
stat.baseline:
Performance counter stats for '/work/c/hackbench 50' (100 runs):
53,165 context-switches # 11002.2 cs/sec cs_per_second ( +- 1.33% )
8,010 cpu-migrations # 1657.6 migrations/sec migrations_per_second ( +- 0.90% )
53,936 page-faults # 11161.7 faults/sec page_faults_per_second ( +- 0.50% )
4,832.24 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.12% )
18,787,710 branch-misses # 1.2 % branch_miss_rate ( +- 0.17% ) (38.88%)
1,452,653,496 branches # 300.6 M/sec branch_frequency ( +- 0.14% ) (61.55%)
15,607,564,080 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.15% ) (56.21%)
7,648,608,518 instructions # 0.5 instructions insn_per_cycle ( +- 0.11% ) (55.82%)
12,025,223,911 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.14% ) (56.26%)
0.808204663 +- 0.001059873 seconds time elapsed ( +- 0.13% )
stat.before:
Performance counter stats for '/work/c/hackbench 50' (100 runs):
54,722 context-switches # 11041.0 cs/sec cs_per_second ( +- 1.35% )
8,170 cpu-migrations # 1648.4 migrations/sec migrations_per_second ( +- 1.08% )
54,295 page-faults # 10954.8 faults/sec page_faults_per_second ( +- 0.53% )
4,956.27 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.14% )
19,304,657 branch-misses # 1.2 % branch_miss_rate ( +- 0.20% ) (37.27%)
1,497,794,368 branches # 302.2 M/sec branch_frequency ( +- 0.17% ) (60.74%)
16,037,658,236 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.16% ) (57.72%)
7,875,024,533 instructions # 0.5 instructions insn_per_cycle ( +- 0.13% ) (57.83%)
12,344,722,147 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.17% ) (55.77%)
0.827636161 +- 0.001027531 seconds time elapsed ( +- 0.12% )
stat.baseline2:
Performance counter stats for '/work/c/hackbench 50' (100 runs):
52,590 context-switches # 10837.7 cs/sec cs_per_second ( +- 1.18% )
7,958 cpu-migrations # 1640.0 migrations/sec migrations_per_second ( +- 0.99% )
53,819 page-faults # 11090.9 faults/sec page_faults_per_second ( +- 0.48% )
4,852.52 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.11% )
18,933,395 branch-misses # 1.2 % branch_miss_rate ( +- 0.18% ) (37.13%)
1,451,361,950 branches # 299.1 M/sec branch_frequency ( +- 0.13% ) (60.09%)
15,683,586,735 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.13% ) (56.05%)
7,628,894,710 instructions # 0.5 instructions insn_per_cycle ( +- 0.10% ) (57.22%)
12,063,750,082 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.14% ) (57.11%)
0.811536383 +- 0.001337259 seconds time elapsed ( +- 0.16% )
stat.after:
Performance counter stats for '/work/c/hackbench 50' (100 runs):
53,799 context-switches # 10743.3 cs/sec cs_per_second ( +- 1.35% )
8,095 cpu-migrations # 1616.5 migrations/sec migrations_per_second ( +- 0.86% )
54,330 page-faults # 10849.4 faults/sec page_faults_per_second ( +- 0.55% )
5,007.67 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.13% )
19,444,339 branch-misses # 1.2 % branch_miss_rate ( +- 0.21% ) (38.04%)
1,504,382,421 branches # 300.4 M/sec branch_frequency ( +- 0.17% ) (60.42%)
16,225,153,060 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.16% ) (56.19%)
7,889,645,005 instructions # 0.5 instructions insn_per_cycle ( +- 0.16% ) (56.30%)
12,488,115,947 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.16% ) (55.55%)
0.835123855 +- 0.001015781 seconds time elapsed ( +- 0.12% )
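For anyone who wants to pull the counters out of the saved files instead of
reading them by eye, here is a minimal sketch. It assumes the text layout
shown above (value, optional "msec" unit, event name); parse_counters() and
the sample string are hypothetical names for illustration and are not part of
perf or trace-cmd.

```python
import re

# Matches a counter line such as:
#   15,607,564,080 cpu-cycles # 3.2 GHz ...
#   4,832.24 msec task-clock # 6.0 CPUs ...
# The "time elapsed" line does not match because "+-" follows its value.
COUNTER_RE = re.compile(r"^\s*([\d,]+(?:\.\d+)?)\s+(?:msec\s+)?([\w-]+)")


def parse_counters(text):
    """Return {event_name: value} for each counter line in perf stat text."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if m:
            counters[m.group(2)] = float(m.group(1).replace(",", ""))
    return counters


# A few lines copied from stat.baseline above.
sample = """\
Performance counter stats for '/work/c/hackbench 50' (100 runs):
4,832.24 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.12% )
15,607,564,080 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.15% ) (56.21%)
0.808204663 +- 0.001059873 seconds time elapsed ( +- 0.13% )
"""

counters = parse_counters(sample)
print(counters)
```

Running it over each of the four files gives one dictionary per run, which
makes it easy to compare any counter, not just cpu-cycles.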
Looking at the difference in cpu-cycles between baseline and baseline2, we have:
15,607,564,080 vs 15,683,586,735, which is an increase of 0.4% (in the noise).
But with tracing enabled, between before and after we have:
16,037,658,236 vs 16,225,153,060, which is 1.1%. That may be low, but it's not insignificant.
Given that enabling tracing slowed the code down by 2.7% (16,037,658,236 vs 15,607,564,080),
adding another 1% is quite an impact!
Tracing now slows it down by 3.9% (16,225,153,060 vs 15,607,564,080), which is a
significant increase from 2.7%.
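To re-check the percentages, here is a minimal sketch using the cpu-cycles
values from the four runs above. pct_increase() is a hypothetical helper name.
The results come out at or slightly above the quoted figures, which suggests
the quoted figures were truncated rather than rounded.

```python
def pct_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# cpu-cycles from the perf stat output above
baseline = 15_607_564_080   # unpatched, tracing off
before = 16_037_658_236     # unpatched, sched events enabled
baseline2 = 15_683_586_735  # patched, tracing off
after = 16_225_153_060      # patched, sched events enabled

print(f"baseline -> baseline2: {pct_increase(baseline, baseline2):.2f}%")
print(f"before   -> after:     {pct_increase(before, after):.2f}%")
print(f"baseline -> before:    {pct_increase(baseline, before):.2f}%")
print(f"baseline -> after:     {pct_increase(baseline, after):.2f}%")
```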
I'd really rather keep memcpy() here.
-- Steve