Re: [PATCH] sched: fair: fix missed CONFIG_SCHEDSTATS

From: Yafang Shao
Date: Wed Mar 06 2019 - 07:54:37 EST


On Wed, Mar 6, 2019 at 8:38 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Mar 06, 2019 at 07:49:36PM +0800, Yafang Shao wrote:
>
>
> $ grep SCHEDSTAT defconfig-build/.config
> # CONFIG_SCHEDSTATS is not set
> $ objdump -dr defconfig-build/kernel/sched/fair.o | awk '/>:$/ { F=$2 } /sched_stat/ { print F " " $0 }'
> <update_curr>: 24cd: R_X86_64_32S __tracepoint_sched_stat_runtime+0x28
> <update_curr>: 24d9: R_X86_64_PC32 __tracepoint_sched_stat_runtime+0x24
> $ patch -p1 < foo
> patching file kernel/sched/fair.c
> $ make O=defconfig-build kernel/sched/
> make[1]: Entering directory '/usr/src/linux-2.6/defconfig-build'
> Using .. as source for kernel
> GEN Makefile
> CALL ../scripts/checksyscalls.sh
> CALL ../scripts/atomic/check-atomics.sh
> DESCEND objtool
> CC kernel/sched/fair.o
> AR kernel/sched/built-in.a
> make[1]: Leaving directory '/usr/src/linux-2.6/defconfig-build'
> $ objdump -dr defconfig-build/kernel/sched/fair.o | awk '/>:$/ { F=$2 } /sched_stat/ { print F " " $0 }'
> $ cat foo
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 8213ff6e365d..6e5ceec3b662 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -839,7 +839,8 @@ static void update_curr(struct cfs_rq *cfs_rq)
>  	if (entity_is_task(curr)) {
>  		struct task_struct *curtask = task_of(curr);
>
> -		trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
> +		if (schedstat_enabled())
> +			trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
>  		cgroup_account_cputime(curtask, delta_exec);
>  		account_group_exec_runtime(curtask, delta_exec);
>  	}
>
>
> _1_ line, where you wanted to add _6_ ugly #ifdefs

I get your point now.

Yes, this code can be removed from the call sites in kernel/sched/fair.c,
but the definitions of these tracepoints are still there,
so they are still exposed in /sys/kernel/debug/tracing/events/sched/.

You can see this by running objdump on vmlinux.
$ objdump -dr kernel/sched/fair.o | awk '/>:$/ { F=$2 } /sched_stat/ { print F " " $0 }' // nothing

$ objdump -dr vmlinux | awk '/>:$/ { F=$2 } /sched_stat/ { print F " " $0 }'
<perf_trace_sched_stat_runtime>: ffffffff810b3c30 <perf_trace_sched_stat_runtime>: // it is still defined


My guess is that they are still referenced by perf or bpf,
so the compiler won't optimize them out.
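
To illustrate what I mean, here is a rough userspace sketch, not kernel code.
All demo_* names are made up for this example, and the event table is only
my assumption about how the real definitions stay referenced. The guarded
call site folds away when the check is a compile-time 0, but the probe
function is still reachable through the table, so it has to stay in the
final image.

```c
#include <stddef.h>

/* Hypothetical stand-in for schedstat_enabled() with CONFIG_SCHEDSTATS=n. */
#define demo_schedstat_enabled() 0

/* Counts how often the probe actually runs. */
static int demo_hits;

/*
 * Hypothetical stand-in for perf_trace_sched_stat_runtime(): it has
 * external linkage and is referenced from the table below, so it cannot
 * be dropped even if no call site uses it.
 */
void demo_perf_trace_sched_stat_runtime(unsigned long delta_exec)
{
	(void)delta_exec;
	demo_hits++;
}

/* Hypothetical event table, loosely modelling how an event is registered. */
struct demo_event {
	const char *name;
	void (*probe)(unsigned long delta_exec);
};

const struct demo_event demo_events[] = {
	{ "sched_stat_runtime", demo_perf_trace_sched_stat_runtime },
};

const size_t demo_nr_events = sizeof(demo_events) / sizeof(demo_events[0]);

/* Call site guarded like in the one-line patch: dead code when the check is 0. */
void demo_update_curr(unsigned long delta_exec)
{
	if (demo_schedstat_enabled())
		demo_perf_trace_sched_stat_runtime(delta_exec);
}
```

Calling demo_update_curr() never reaches the probe here, yet the
"sched_stat_runtime" entry is still in demo_events[] and can still be
invoked through its function pointer.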

Thanks
Yafang