Re: [PATCH v2 3/6] treewide: Replace memcpy(..., current->comm) with strscpy()

From: Steven Rostedt

Date: Tue May 26 2026 - 19:06:37 EST


On Sun, 24 May 2026 19:38:53 -0300
André Almeida <andrealmeid@xxxxxxxxxx> wrote:

> In order to increase the size of current->comm[] and to avoid breaking any
> existing code, replace memcpy() with strscpy(). The latter function makes
> sure that the copy is NUL terminated. This is crucial given that the
> source buffer might be larger than the destination buffer and could
> truncate the NUL character out of it.
>
> Signed-off-by: André Almeida <andrealmeid@xxxxxxxxxx>
> ---
> Changes from v2:
> - New patch, dropped strtostr() from last version
> ---
> include/linux/coredump.h | 2 +-
> include/linux/tracepoint.h | 4 ++--
> include/trace/events/block.h | 10 +++++-----
> include/trace/events/coredump.h | 2 +-
> include/trace/events/f2fs.h | 4 ++--
> include/trace/events/oom.h | 2 +-
> include/trace/events/osnoise.h | 2 +-
> include/trace/events/sched.h | 10 +++++-----
> include/trace/events/signal.h | 2 +-
> include/trace/events/task.h | 4 ++--
> kernel/printk/nbcon.c | 2 +-
> kernel/printk/printk.c | 2 +-
> 12 files changed, 23 insertions(+), 23 deletions(-)
>

So I was curious as to what impact this would have on tracing. I decided to
run the following:

perf stat -r 100 ./hackbench 50

to see how it affects things. Hackbench is a bit of a microbenchmark, but it
stresses the scheduler and thus the scheduler trace events.

I first ran the above and put the output into "stat.baseline", then I enabled
all scheduler trace events:

trace-cmd start -e sched

and ran it again and put the output into "stat.before".

I applied the patch and ran it again before enabling tracing (just to see
the variance) and put that into "stat.baseline2". I then enabled tracing
and ran it again and put the output into "stat.after".

Here are the results:

stat.baseline:

Performance counter stats for '/work/c/hackbench 50' (100 runs):

53,165 context-switches # 11002.2 cs/sec cs_per_second ( +- 1.33% )
8,010 cpu-migrations # 1657.6 migrations/sec migrations_per_second ( +- 0.90% )
53,936 page-faults # 11161.7 faults/sec page_faults_per_second ( +- 0.50% )
4,832.24 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.12% )
18,787,710 branch-misses # 1.2 % branch_miss_rate ( +- 0.17% ) (38.88%)
1,452,653,496 branches # 300.6 M/sec branch_frequency ( +- 0.14% ) (61.55%)
15,607,564,080 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.15% ) (56.21%)
7,648,608,518 instructions # 0.5 instructions insn_per_cycle ( +- 0.11% ) (55.82%)
12,025,223,911 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.14% ) (56.26%)

0.808204663 +- 0.001059873 seconds time elapsed ( +- 0.13% )

stat.before:

Performance counter stats for '/work/c/hackbench 50' (100 runs):

54,722 context-switches # 11041.0 cs/sec cs_per_second ( +- 1.35% )
8,170 cpu-migrations # 1648.4 migrations/sec migrations_per_second ( +- 1.08% )
54,295 page-faults # 10954.8 faults/sec page_faults_per_second ( +- 0.53% )
4,956.27 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.14% )
19,304,657 branch-misses # 1.2 % branch_miss_rate ( +- 0.20% ) (37.27%)
1,497,794,368 branches # 302.2 M/sec branch_frequency ( +- 0.17% ) (60.74%)
16,037,658,236 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.16% ) (57.72%)
7,875,024,533 instructions # 0.5 instructions insn_per_cycle ( +- 0.13% ) (57.83%)
12,344,722,147 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.17% ) (55.77%)

0.827636161 +- 0.001027531 seconds time elapsed ( +- 0.12% )


stat.baseline2:

Performance counter stats for '/work/c/hackbench 50' (100 runs):

52,590 context-switches # 10837.7 cs/sec cs_per_second ( +- 1.18% )
7,958 cpu-migrations # 1640.0 migrations/sec migrations_per_second ( +- 0.99% )
53,819 page-faults # 11090.9 faults/sec page_faults_per_second ( +- 0.48% )
4,852.52 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.11% )
18,933,395 branch-misses # 1.2 % branch_miss_rate ( +- 0.18% ) (37.13%)
1,451,361,950 branches # 299.1 M/sec branch_frequency ( +- 0.13% ) (60.09%)
15,683,586,735 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.13% ) (56.05%)
7,628,894,710 instructions # 0.5 instructions insn_per_cycle ( +- 0.10% ) (57.22%)
12,063,750,082 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.14% ) (57.11%)

0.811536383 +- 0.001337259 seconds time elapsed ( +- 0.16% )

stat.after:

Performance counter stats for '/work/c/hackbench 50' (100 runs):

53,799 context-switches # 10743.3 cs/sec cs_per_second ( +- 1.35% )
8,095 cpu-migrations # 1616.5 migrations/sec migrations_per_second ( +- 0.86% )
54,330 page-faults # 10849.4 faults/sec page_faults_per_second ( +- 0.55% )
5,007.67 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.13% )
19,444,339 branch-misses # 1.2 % branch_miss_rate ( +- 0.21% ) (38.04%)
1,504,382,421 branches # 300.4 M/sec branch_frequency ( +- 0.17% ) (60.42%)
16,225,153,060 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.16% ) (56.19%)
7,889,645,005 instructions # 0.5 instructions insn_per_cycle ( +- 0.16% ) (56.30%)
12,488,115,947 stalled-cycles-frontend # 0.77 frontend_cycles_idle ( +- 0.16% ) (55.55%)

0.835123855 +- 0.001015781 seconds time elapsed ( +- 0.12% )
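
For anyone who wants to compare the four stat files without copying numbers
by hand, here is a minimal sketch of pulling the counters out of perf stat
text like the above. The regular expression and the parse_perf_stat() helper
are illustrative assumptions, not part of perf; they only handle the line
layout shown in this mail.

```python
import re

# Assumed line layout (taken from the output quoted in this mail):
#   15,607,564,080 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.15% ) (56.21%)
#   4,832.24 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.12% )
# Group 1 is the raw value, group 2 the event name.
COUNTER_LINE = re.compile(
    r"^\s*([\d,]+(?:\.\d+)?)\s+(?:msec\s+)?([A-Za-z][\w-]*)\s+#"
)


def parse_perf_stat(text):
    """Return {event_name: value} for each counter line in perf stat text."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            # Strip the thousands separators before converting.
            counters[match.group(2)] = float(match.group(1).replace(",", ""))
    return counters


# A few lines copied from stat.baseline, used as a self-check.
SAMPLE = """\
 53,165 context-switches # 11002.2 cs/sec cs_per_second ( +- 1.33% )
 4,832.24 msec task-clock # 6.0 CPUs CPUs_utilized ( +- 0.12% )
 15,607,564,080 cpu-cycles # 3.2 GHz cycles_frequency ( +- 0.15% ) (56.21%)
 0.808204663 +- 0.001059873 seconds time elapsed ( +- 0.13% )
"""

if __name__ == "__main__":
    for name, value in parse_perf_stat(SAMPLE).items():
        print(f"{name:20s} {value:,.2f}")
```

The "seconds time elapsed" line is deliberately not matched, since it has no
event name followed by '#'; it would need its own pattern.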


Looking at the difference between cpu-cycles of baseline and baseline2, we have:

15,607,564,080 vs 15,683,586,735 where it went up by 0.4% (in the noise).

But with tracing enabled, between before and after we have:

16,037,658,236 vs 16,225,153,060, which is 1.1%. That may be low, but it's not insignificant.

Given that enabling tracing slowed the code down by 2.7% (16,037,658,236 vs 15,607,564,080),
adding another 1% is quite an impact!

Tracing now slows it down by 3.9%, which is a significant increase from 2.7%.
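
The percentages above can be rechecked with a few lines of Python. This is
only a sketch of the arithmetic: the cycle counts are copied from the perf
stat output in this mail, and pct_increase() is a helper name made up for the
example. The figures quoted above (0.4%, 1.1%, 2.7%, 3.9%) appear to be the
computed values truncated to one decimal place rather than rounded.

```python
# cpu-cycles averages copied from the four perf stat runs in this mail.
CYCLES = {
    "baseline": 15_607_564_080,   # unpatched, tracing off
    "before": 16_037_658_236,     # unpatched, sched events enabled
    "baseline2": 15_683_586_735,  # patched, tracing off
    "after": 16_225_153_060,      # patched, sched events enabled
}


def pct_increase(old, new):
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100.0


# (reference run, compared run, what the comparison measures)
COMPARISONS = [
    ("baseline", "baseline2", "patch with tracing off (noise)"),
    ("before", "after", "patch with tracing on"),
    ("baseline", "before", "tracing overhead, unpatched"),
    ("baseline", "after", "tracing overhead, patched"),
]

if __name__ == "__main__":
    for old, new, label in COMPARISONS:
        delta = pct_increase(CYCLES[old], CYCLES[new])
        print(f"{old:>9} -> {new:<9} {delta:6.2f}%  {label}")
```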

I'd really rather keep memcpy() here.

-- Steve