Re: [PATCH] arch: x86: power: cpu: init %gs before __restore_processor_state (clang)

From: Borislav Petkov
Date: Tue Sep 15 2020 - 14:06:45 EST


On Tue, Sep 15, 2020 at 10:26:58AM -0700, rkir@xxxxxxxxxx wrote:
> From: Haitao Shan <hshan@xxxxxxxxxx>
>
> This is a workaround which fixes a triple fault
> in __restore_processor_state on clang when
> built with LTO.
>
> When load_TR_desc and load_mm_ldt are inlined into
> fix_processor_context due to LTO, they cause
> fix_processor_context (or in this case __restore_processor_state,
> as fix_processor_context was inlined into __restore_processor_state)
> to access the stack canary through %gs, but before
> __restore_processor_state has restored the previous value
> of %gs properly. LLVM appears to be inlining functions with stack
> protectors into functions compiled with -fno-stack-protector,
> which is likely a bug in LLVM's inliner that needs to be fixed.
>
> The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479
>
> Signed-off-by: Haitao Shan <hshan@xxxxxxxxxx>
> Signed-off-by: Roman Kiryanov <rkir@xxxxxxxxxx>

Ok, google guys, pls make sure you Cc LKML too as this is where *all*
patches and discussions are archived. Adding it now to Cc.

> ---
> arch/x86/power/cpu.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> index db1378c6ff26..e5677adb2d28 100644
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
>  /* Needed by apm.c */
>  void notrace restore_processor_state(void)
>  {
> +#ifdef __clang__
> +	// The following code snippet is copied from __restore_processor_state.
> +	// Its purpose is to prepare GS segment before the function is called.
> +#ifdef CONFIG_X86_64
> +	wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
> +#else
> +	loadsegment(fs, __KERNEL_PERCPU);
> +	loadsegment(gs, __KERNEL_STACK_CANARY);
> +#endif
> +#endif

Ok, so why is the kernel supposed to take yet another ugly workaround
because there's a bug in the compiler?

If it is too late to fix it there, then maybe disable LTO builds for the
buggy version only.

We had a similar discussion this week and we already have one buggy
compiler to deal with and this second one is not making it any easier...
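
For reference, the inlining problem from the commit message can be
sketched in user space roughly like this. The names helper_with_buffer()
and restore_state_sketch() are made up for illustration and are not
kernel functions. Marking the helper noinline is only an assumed way of
keeping a canary-checked frame out of its caller. It is not what the
patch above does, and it is not a claim about how LLVM should fix it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NOINLINE __attribute__((noinline))

/*
 * Hypothetical stand-in for load_TR_desc()/load_mm_ldt(): a helper with a
 * local array. With -fstack-protector, a frame like this typically gets a
 * canary check, which on x86 is read through a segment register.
 */
NOINLINE size_t helper_with_buffer(const char *src)
{
	char buf[64];

	assert(src != NULL);
	/* Bounded copy, then force NUL termination. */
	strncpy(buf, src, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';
	return strlen(buf);
}

/*
 * Hypothetical stand-in for __restore_processor_state(): a caller that is
 * assumed to run before the canary can be read safely. Because the helper
 * is noinline, its array (and any canary check) stays in the helper's own
 * frame instead of being merged into this one.
 */
size_t restore_state_sketch(const char *src)
{
	return helper_with_buffer(src);
}
```

Calling restore_state_sketch("gs") just returns the copied string's
length. The point is only where the stack buffer ends up, not what the
helper computes.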

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette