Re: [PATCH 1/4] sched: make nr_running() return 32-bit

From: Alexey Dobriyan
Date: Thu May 13 2021 - 17:22:53 EST


On Thu, May 13, 2021 at 11:58:38AM +0200, Ingo Molnar wrote:
>
> * Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>

> > Just with the default config for one of my reference machines:
> >
> >     text    data     bss      dec     hex filename
> > 16679864 6627950 1671296 24979110 17d26a6 ../build/vmlinux-before
> > 16679894 6627950 1671296 24979140 17d26c4 ../build/vmlinux-after
> > ------------------------------------------------------------------------
> > +30
>
> Using '/usr/bin/size' to compare before/after generated code is the wrong
> way to measure code generation improvements for smaller changes due to
> default alignment creating a reserve of 'padding' bytes at the end of most
> functions. You have to look at the low level generated assembly directly.

Below is bloat-o-meter output with Fedora 33 .config.

Some functions get bigger but the total is smaller (otherwise why would I
send it). Apparently something got 1 byte too many and pushed past its
alignment padding.
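
To illustrate the padding effect, here is a minimal Python sketch. It
assumes every function is padded out to a 16-byte boundary; the real
alignment depends on the compiler and the config. The helper name
padded_size() and the 64/65 byte sizes are made up for illustration.

```python
def padded_size(size, align=16):
    """Round a function's byte size up to the next multiple of align.

    align=16 is an assumption, not taken from the builds discussed here.
    """
    return (size + align - 1) // align * align


# calc_load_fold_active grows from 50 to 56 bytes in the table below.
# Under the assumed alignment both sizes occupy the same padded slot.
print(padded_size(50), padded_size(56))

# Hypothetical function that ends exactly on a boundary: a single extra
# byte spills into a whole new slot.
print(padded_size(64), padded_size(65))
```

Under that assumption a 6-byte growth can vanish into existing padding
while a 1-byte growth can cost 16 bytes, so a section-level total can
move in the opposite direction from the per-function deltas.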

add/remove: 0/0 grow/shrink: 6/21 up/down: 18/-42 (-24)
Function                                     old     new   delta
calc_load_fold_active                         50      56      +6
calc_load_nohz_start                         100     103      +3
calc_load_nohz_remote                         85      88      +3
calc_global_load_tick                         86      89      +3
pull_dl_task                                 901     903      +2
switched_from_dl                             613     614      +1
update_rt_migration                          165     164      -1
update_dl_migration                          141     140      -1
ttwu_do_activate                             181     180      -1
tick_nohz_idle_exit                          225     224      -1
tick_irq_enter                               227     226      -1
print_rt_rq.cold                             238     237      -1
print_rt_rq                                  413     412      -1
nr_iowait_cpu                                 31      30      -1
init_rt_rq                                   143     142      -1
show_stat                                   1745    1743      -2
nr_running                                    75      73      -2
nr_iowait                                     83      81      -2
get_cpu_iowait_time_us                       260     258      -2
get_cpu_idle_time_us                         260     258      -2
find_lock_later_rq                           507     505      -2
enqueue_task_rt                              777     775      -2
enqueue_task_dl                             2461    2459      -2
dequeue_rt_stack                             576     574      -2
menu_select                                 1492    1489      -3
__dequeue_dl_entity                          419     414      -5
init_dl_rq                                    88      81      -7
Total: Before=25729849, After=25729825, chg -0.00%
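
The summary line can be cross-checked against the per-function rows. The
sketch below is not bloat-o-meter itself, only a small Python script that
recomputes the grow/shrink counts and byte totals from the old/new sizes
copied out of the table above. The names ROWS and summarize() are
hypothetical.

```python
# Rows copied from the table above: function name, old size, new size.
ROWS = """
calc_load_fold_active 50 56
calc_load_nohz_start 100 103
calc_load_nohz_remote 85 88
calc_global_load_tick 86 89
pull_dl_task 901 903
switched_from_dl 613 614
update_rt_migration 165 164
update_dl_migration 141 140
ttwu_do_activate 181 180
tick_nohz_idle_exit 225 224
tick_irq_enter 227 226
print_rt_rq.cold 238 237
print_rt_rq 413 412
nr_iowait_cpu 31 30
init_rt_rq 143 142
show_stat 1745 1743
nr_running 75 73
nr_iowait 83 81
get_cpu_iowait_time_us 260 258
get_cpu_idle_time_us 260 258
find_lock_later_rq 507 505
enqueue_task_rt 777 775
enqueue_task_dl 2461 2459
dequeue_rt_stack 576 574
menu_select 1492 1489
__dequeue_dl_entity 419 414
init_dl_rq 88 81
"""


def summarize(rows):
    """Return (grow, shrink, up, down) computed from 'name old new' lines."""
    grow = shrink = up = down = 0
    for line in rows.strip().splitlines():
        _name, old, new = line.split()
        delta = int(new) - int(old)
        if delta > 0:
            grow += 1
            up += delta
        elif delta < 0:
            shrink += 1
            down += delta
    return grow, shrink, up, down


grow, shrink, up, down = summarize(ROWS)
print(f"grow/shrink: {grow}/{shrink} up/down: {up}/{down} ({up + down:+d})")
```

The result agrees with the summary line (6/21, 18/-42, net -24) and with
the Before/After totals, which differ by 24 bytes.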