Re: [RFC][PATCH 3/4] sched: Remove struct sched_class next field

From: Rasmus Villemoes
Date: Fri Dec 20 2019 - 07:14:33 EST


On 19/12/2019 22.44, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@xxxxxxxxxxx>
>
> Now that the sched_class descriptors are defined in order via the linker
> script vmlinux.lds.h, there's no reason to have a "next" pointer to the
> previous priroity structure. The order of the sturctures can be aligned as
> an array, and used to index and find the next sched_class descriptor.
>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> ---
> include/asm-generic/vmlinux.lds.h | 1 +
> kernel/sched/deadline.c | 1 -
> kernel/sched/fair.c | 1 -
> kernel/sched/idle.c | 1 -
> kernel/sched/rt.c | 1 -
> kernel/sched/sched.h | 6 +++---
> kernel/sched/stop_task.c | 1 -
> 7 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 1c14c4ddf785..f4d480c4f7c6 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -128,6 +128,7 @@
> */
> #define SCHED_DATA \
> STRUCT_ALIGN(); \
> + __start_sched_classes = .; \
> *(__idle_sched_class) \
> *(__fair_sched_class) \
> *(__rt_sched_class) \

This is broken. It works by accident on a 64 bit SMP config, since you
start at a 32 byte boundary, then include four 8-byte aligned structs,
so the second STRUCT_ALIGN (not visible in this hunk, but comes from the
STOP_SCHED_CLASS) is a no-op, and stop_sched_class ends up at the right
offset from the previous one.

But, for example, a 32 bit non-smp kernel with CONFIG_FAIR_GROUP_SCHED=y
has sizeof(struct sched_class) == 68, and

$ nm -n vmlinux | grep sched_class
c0728660 D idle_sched_class
c0728660 D __start_sched_classes
c07286a4 D fair_sched_class
c07286e8 D rt_sched_class
c0728740 D dl_sched_class
c0728740 D sched_class_highest

notice dl_sched_class is 88 bytes beyond rt_sched_class, while the
others are properly 68-byte separated.

So just drop the second STRUCT_ALIGN (and maybe the first as well).
Maybe throw in some ASSERTs in the linker script, but since the linker
doesn't know sizeof(struct sched_class), the best one can do is perhaps
some kind of ASSERT(fair_sched_class - idle_sched_class ==
rt_sched_class - fair_sched_class). And/or include a BUG_ON that checks
that the sched_class elements actually constitute a proper "struct
sched_class[]" array.

Rasmus