Re: [patch 02/38] x86/cpu: Use native_wrmsrl() in load_percpu_segment()

From: Thomas Gleixner
Date: Sun Jul 17 2022 - 15:12:36 EST


On Sun, Jul 17 2022 at 00:22, Andrew Cooper wrote:
>> -void load_percpu_segment(int cpu)
>> +static noinstr void load_percpu_segment(int cpu)
>> {
>> #ifdef CONFIG_X86_32
>> loadsegment(fs, __KERNEL_PERCPU);
>> #else
>> __loadsegment_simple(gs, 0);
>> - wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
>> + /*
>> + * Because of the __loadsegment_simple(gs, 0) above, any GS-prefixed
>> + * instruction will explode right about here. As such, we must not have
>> + * any CALL-thunks using per-cpu data.
>> + *
>> + * Therefore, use native_wrmsrl() and have XenPV take the fault and
>> + * emulate.
>> + */
>> + native_wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
>> #endif
>
> Lovely :-/
>
> But I still don't see how that works, because __loadsegment_simple() is
> a memory clobber and cpu_kernelmode_gs_base() has a per-cpu lookup in
> it.

No. It uses an array lookup :)

> That said, this has only a single caller, and in context it's bogus for
> 64bit.  Can't we fix all the problems by just doing this:
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 736262a76a12..6f393bc9d89d 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -701,16 +701,6 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
>  __u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
>  __u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
>  
> -void load_percpu_segment(int cpu)
> -{
> -#ifdef CONFIG_X86_32
> -       loadsegment(fs, __KERNEL_PERCPU);
> -#else
> -       __loadsegment_simple(gs, 0);
> -       wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
> -#endif
> -}
> -
>  #ifdef CONFIG_X86_32
>  /* The 32-bit entry code needs to find cpu_entry_area. */
>  DEFINE_PER_CPU(struct cpu_entry_area *, cpu_entry_area);
> @@ -742,12 +732,15 @@ EXPORT_SYMBOL_GPL(load_fixmap_gdt);
>   * Current gdt points %fs at the "master" per-cpu area: after this,
>   * it's on the real one.
>   */
> -void switch_to_new_gdt(int cpu)
> +void __noinstr switch_to_new_gdt(int cpu)
>  {
>         /* Load the original GDT */
>         load_direct_gdt(cpu);
> +
> +#ifdef CONFIG_X86_32
>         /* Reload the per-cpu base */
> -       load_percpu_segment(cpu);
> +       loadsegment(fs, __KERNEL_PERCPU);
> +#endif
>  }
>  
>  static const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
>
>
> It's only 32bit where the percpu pointer is tied to the GDT.  On 64bit,
> gsbase is good before this, and remains good after.
>
> With this change,
>
> # Make sure load_percpu_segment has no stackprotector
> CFLAGS_common.o         := -fno-stack-protector
>
> comes up for re-evaluation too.

Good point. Let me stare at it some more.

Thanks,

tglx