Re: [PATCH 1/1] sched/headers: fix thread_info.<first> is overwritten by STACK_END_MAGIC

From: Kees Cook
Date: Thu Nov 29 2018 - 16:23:24 EST


On Tue, Nov 27, 2018 at 8:38 PM Wang, Dongsheng
<dongsheng.wang@xxxxxxxxxxxxxxxx> wrote:
>
> Hello Kees,
>
> On 2018/11/28 6:38, Kees Cook wrote:
> > On Thu, Nov 22, 2018 at 11:54 PM, Wang Dongsheng
> > <dongsheng.wang@xxxxxxxxxxxxxxxx> wrote:
> >> When select ARCH_TASK_STRUCT_ON_STACK the first of thread_info variable
> >> is overwritten by STACK_END_MAGIC. In fact, the ARCH_TASK_STRUCT_ON_STACK
> >> is not a real task on stack, it's only init_task on init_stack.
> >>
> >> Commit 0500871f21b2 ("Construct init thread stack in the linker script
> >> rather than by union") added this macro and put task_strcut into
> >> thread_union. This brings us the following possibilities:
> >> TASK_ON_STACK THREAD_INFO_IN_TASK STACK
> >> ----- <-- thread_info & stack
> >> N N | | --- <-- task
> >> | | | |
> >> ----- ---
> >>
> >> ----- <-- stack
> >> N Y | | --- <-- task(Including thread_info)
> >> | | | |
> >> ----- ---
> >>
> >> ----- <-- stack & task & thread_info
> >> Y N | |
> >> | |
> >> -----
> >>
> >> ----- <-- stack & task(Including thread_info)
> >> Y Y | |
> >> | |
> >> -----
> >> The kernel has handled the first two cases correctly.
> >>
> >> For the third case:
> >> TASK_ON_STACK: Y. THREAD_INFO_IN_TASK: N. this case
> >> should never happen, because the task and thread_info will overlap. So
> >> when TASK_ON_STACK is selected, THREAD_INFO_IN_TASK must be selected too.
> >>
> >> For the fourth case:
> >> When task on stack, the end of stack should add a sizeof(task_struct) offset.
> >>
> >> This patch handled with the third and fourth case.
> >>
> >> Fixes: 0500871f21b2 ("Construct init thread stack in the linker ...")
> >>
> >> Signed-off-by: Wang Dongsheng <dongsheng.wang@xxxxxxxxxxxxxxxx>
> >> Signed-off-by: Shunyong Yang <shunyong.yang@xxxxxxxxxxxxxxxx>
> >> ---
> >> arch/Kconfig | 1 +
> >> include/linux/sched/task_stack.h | 5 ++++-
> >> 2 files changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/Kconfig b/arch/Kconfig
> >> index e1e540ffa979..0a2c73e73195 100644
> >> --- a/arch/Kconfig
> >> +++ b/arch/Kconfig
> >> @@ -251,6 +251,7 @@ config ARCH_HAS_SET_MEMORY
> >> # Select if arch init_task must go in the __init_task_data section
> >> config ARCH_TASK_STRUCT_ON_STACK
> >> bool
> >> + depends on THREAD_INFO_IN_TASK || IA64
> > The "IA64" part shouldn't be needed since IA64 already selects it.
> >
> > Since it's selected, it also can't have a depends, IIUC.
>
> Since the IA64 thread_info including task_struct, it doesn't need to
> select THREAD_INFO_IN_TASK.
> So we need to allow IA64 select ARCH_TASK_STRUCT_ON_STACK without
> THREAD_INFO.

Okay.

> >> # Select if arch has its private alloc_task_struct() function
> >> config ARCH_TASK_STRUCT_ALLOCATOR
> >> diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
> >> index 6a841929073f..624c48defb9e 100644
> >> --- a/include/linux/sched/task_stack.h
> >> +++ b/include/linux/sched/task_stack.h
> >> @@ -7,6 +7,7 @@
> >> */
> >>
> >> #include <linux/sched.h>
> >> +#include <linux/sched/task.h>
> >> #include <linux/magic.h>
> >>
> >> #ifdef CONFIG_THREAD_INFO_IN_TASK
> >> @@ -25,7 +26,9 @@ static inline void *task_stack_page(const struct task_struct *task)
> >>
> >> static inline unsigned long *end_of_stack(const struct task_struct *task)
> >> {
> >> - return task->stack;
> >> + if (!IS_ENABLED(CONFIG_ARCH_TASK_STRUCT_ON_STACK) || task != &init_task)
> >> + return task->stack;
> >> + return (unsigned long *)(task + 1);
> >> }
> > This seems like a strange place for the change. It feels more like
> > init_task has been defined incorrectly.
>
> The init_task will put into init_stack when ARCH_TASK_STRUCT_ON_STACK is
> selected.
> include/asm-generic/vmlinux.lds.h:
> #define INIT_TASK_DATA(align) \
> . = ALIGN(align); \
> __start_init_task = .; \
> init_thread_union = .; \
> init_stack = .; \
> KEEP(*(.data..init_task)) \
> KEEP(*(.data..init_thread_info)) \
> . = __start_init_task + THREAD_SIZE; \
> __end_init_task = .;
>
> So we need end_of_stack to offset sizeof(task_struct).

Well, I guess I mean I'd rather the end_of_stack() code not be
special-cased in the if. The default should be how it was. Perhaps:

if (IS_ENABLED(CONFIG_ARCH_TASK_STRUCT_ON_STACK) && task == &init_task)
return (unsigned long *)(task + 1);
return task->stack;

But it still feels strange: why can't task->stack point to the correct
place in this special case?

--
Kees Cook