Re: [patch 02/38] x86/cpu: Use native_wrmsrl() in load_percpu_segment()
From: Thomas Gleixner
Date: Sun Jul 17 2022 - 15:12:36 EST
On Sun, Jul 17 2022 at 00:22, Andrew Cooper wrote:
>> -void load_percpu_segment(int cpu)
>> +static noinstr void load_percpu_segment(int cpu)
>> {
>> #ifdef CONFIG_X86_32
>> loadsegment(fs, __KERNEL_PERCPU);
>> #else
>> __loadsegment_simple(gs, 0);
>> - wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
>> + /*
>> + * Because of the __loadsegment_simple(gs, 0) above, any GS-prefixed
>> + * instruction will explode right about here. As such, we must not have
>> + * any CALL-thunks using per-cpu data.
>> + *
>> + * Therefore, use native_wrmsrl() and have XenPV take the fault and
>> + * emulate.
>> + */
>> + native_wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
>> #endif
>
> Lovely :-/
>
> But I still don't see how that works, because __loadsegment_simple() is
> a memory clobber and cpu_kernelmode_gs_base() has a per-cpu lookup in
> it.
No. cpu_kernelmode_gs_base() is not a GS-relative access; it's a plain array lookup via the per-CPU offset table :)
> That said, this only has a sole caller, and in context, it's bogus for
> 64bit. Can't we fix all the problems by just doing this:
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 736262a76a12..6f393bc9d89d 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -701,16 +701,6 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
> __u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
> __u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
>
> -void load_percpu_segment(int cpu)
> -{
> -#ifdef CONFIG_X86_32
> - loadsegment(fs, __KERNEL_PERCPU);
> -#else
> - __loadsegment_simple(gs, 0);
> - wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
> -#endif
> -}
> -
> #ifdef CONFIG_X86_32
> /* The 32-bit entry code needs to find cpu_entry_area. */
> DEFINE_PER_CPU(struct cpu_entry_area *, cpu_entry_area);
> @@ -742,12 +732,15 @@ EXPORT_SYMBOL_GPL(load_fixmap_gdt);
> * Current gdt points %fs at the "master" per-cpu area: after this,
> * it's on the real one.
> */
> -void switch_to_new_gdt(int cpu)
> +noinstr void switch_to_new_gdt(int cpu)
> {
> /* Load the original GDT */
> load_direct_gdt(cpu);
> +
> +#ifdef CONFIG_X86_32
> /* Reload the per-cpu base */
> - load_percpu_segment(cpu);
> + loadsegment(fs, __KERNEL_PERCPU);
> +#endif
> }
>
> static const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
>
>
> It's only 32bit where the percpu pointer is tied to the GDT. On 64bit,
> gsbase is good before this, and remains good after.
>
> With this change,
>
> # Make sure load_percpu_segment has no stackprotector
> CFLAGS_common.o := -fno-stack-protector
>
> comes up for re-evaluation too.
Good point. Let me stare at it some more.
Thanks,
tglx