Re: [PATCH RESEND] sched/stats: Optimize /proc/schedstat printing

From: Peter Zijlstra

Date: Wed Oct 29 2025 - 10:08:07 EST


On Wed, Oct 29, 2025 at 01:07:15PM +0000, Dmitry Ilvokhin wrote:
> seq_printf() supports rich format strings for printing decimals, but
> /proc/schedstat has no need for them, since the majority of its data
> is space-separated decimals. Use seq_put_decimal_ull() instead as a
> faster alternative.
>
> Performance counter stats (truncated) for
> sh -c 'cat /proc/schedstat > /dev/null', before and after applying the
> patch on a machine with 72 CPUs, are below.
>
> Before:
>
>          2.94 msec task-clock                #    0.820 CPUs utilized
>             1      context-switches          #  340.551 /sec
>             0      cpu-migrations            #    0.000 /sec
>           340      page-faults               #  115.787 K/sec
>    10,327,200      instructions              #     1.89 insn per cycle
>                                              #     0.10 stalled cycles per insn
>     5,458,307      cycles                    #    1.859 GHz
>     1,052,733      stalled-cycles-frontend   #   19.29% frontend cycles idle
>     2,066,321      branches                  #  703.687 M/sec
>        25,621      branch-misses             #    1.24% of all branches
>
>    0.00357974 +- 0.00000209 seconds time elapsed ( +- 0.06% )
>
> After:
>
>          2.50 msec task-clock                #    0.785 CPUs utilized
>             1      context-switches          #  399.780 /sec
>             0      cpu-migrations            #    0.000 /sec
>           340      page-faults               #  135.925 K/sec
>     7,371,867      instructions              #     1.59 insn per cycle
>                                              #     0.13 stalled cycles per insn
>     4,647,053      cycles                    #    1.858 GHz
>       986,487      stalled-cycles-frontend   #   21.23% frontend cycles idle
>     1,591,374      branches                  #  636.199 M/sec
>        28,973      branch-misses             #    1.82% of all branches
>
>    0.00318461 +- 0.00000295 seconds time elapsed ( +- 0.09% )
>
> This is a ~11% (relative) improvement in elapsed time.

Yeah, but who cares? Why do we want less obvious code for a silly stats
file?
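
To make the readability trade-off concrete, here is a minimal user-space
sketch of the two styles. It is not the patch and not kernel code: struct
outbuf, buf_put_str(), buf_put_decimal_ull() and both emit_*() functions are
hypothetical stand-ins that only mimic the shape of seq_printf() and
seq_put_decimal_ull(), and the "cpuN a b c" line layout is a simplified
illustration rather than the real /proc/schedstat format.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a seq_file output buffer. */
struct outbuf {
	char data[256];
	size_t len;
};

/* Append a plain string; silently drops it on overflow (sketch only). */
static void buf_put_str(struct outbuf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n >= sizeof(b->data))
		return;
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Append delimiter + decimal digits without parsing any format string. */
static void buf_put_decimal_ull(struct outbuf *b, const char *delim,
				unsigned long long num)
{
	char tmp[20];		/* ULLONG_MAX has 20 decimal digits */
	size_t i = sizeof(tmp);
	size_t dlen = strlen(delim);

	/* Generate digits from least to most significant. */
	do {
		tmp[--i] = (char)('0' + num % 10);
		num /= 10;
	} while (num != 0);

	if (b->len + dlen + (sizeof(tmp) - i) >= sizeof(b->data))
		return;
	memcpy(b->data + b->len, delim, dlen);
	b->len += dlen;
	memcpy(b->data + b->len, tmp + i, sizeof(tmp) - i);
	b->len += sizeof(tmp) - i;
	b->data[b->len] = '\0';
}

/* Style 1: one call, the line layout is visible in the format string. */
static void emit_printf_style(struct outbuf *b, unsigned int cpu,
			      const unsigned long long v[3])
{
	size_t room = sizeof(b->data) - b->len;
	int n = snprintf(b->data + b->len, room, "cpu%u %llu %llu %llu\n",
			 cpu, v[0], v[1], v[2]);

	if (n > 0 && (size_t)n < room)
		b->len += (size_t)n;
}

/* Style 2: several calls, no format parsing, layout spread over the code. */
static void emit_put_style(struct outbuf *b, unsigned int cpu,
			   const unsigned long long v[3])
{
	buf_put_str(b, "cpu");
	buf_put_decimal_ull(b, "", cpu);
	for (size_t i = 0; i < 3; i++)
		buf_put_decimal_ull(b, " ", v[i]);
	buf_put_str(b, "\n");
}
```

Both functions produce the same bytes for the same input. The second avoids
run-time format-string parsing, which is presumably where the instruction
count drops, but the output layout can no longer be read off a single line.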