Re: Save a WRMSR GS.base?
From: H. Peter Anvin
Date: Fri Jun 05 2026 - 11:18:45 EST
On June 5, 2026 2:13:07 AM PDT, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>On 05/06/2026 6:05 am, H. Peter Anvin wrote:
>> On June 4, 2026 9:38:46 PM PDT, Borislav Petkov <bp@xxxxxxxxx> wrote:
>>> On Thu, Jun 04, 2026 at 09:30:33PM -0700, H. Peter Anvin wrote:
>>>> On June 4, 2026 9:26:52 PM PDT, Borislav Petkov <bp@xxxxxxxxx> wrote:
>>>>> On Thu, Jun 04, 2026 at 08:20:57PM -0700, H. Peter Anvin wrote:
>>>>>> I guess the question is why there is a "first" one.
>>>>> That happens when we do:
>>>>>
>>>>> x86_fsgsbase_load()
>>>>>
>>>>> loadseg(GS) -> load_gs_index() -> native_load_gs_index() ->
>>>>> if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>>>> native_lkgs(selector);
>>>>>
>>>>> then back in x86_fsgsbase_load() we do:
>>>>>
>>>>> __wrgsbase_inactive(next->gsbase);
>>>>>
>>>>> which does
>>>>>
>>>>> wrmsrq(MSR_KERNEL_GS_BASE, gsbase);
>>>>>
>>>>> on FRED.
>>>>>
>>>>> But LKGS already wrote MSR_KERNEL_GS_BASE...
>>>>>
>>>>>> Logically the sequence should be LKGS first, if needed; then WRMSR(NS). LKGS
>>>>>> can be replaced with swapgs/mov gs/swapgs on legacy.
>>>>> Right.
>>>>>
>>>>> I think avoiding that second WRMSR(MSR_KERNEL_GS_BASE) should give some perf
>>>>> back...
>>>>>
>>>>> Although, I need to think how to make it pretty...
>>>>>
>>>> Should be doing wrmsrns...
>>> No, I think that second WRMSR* should not happen at all if we have executed
>>> LKGS which has already written MSR_KERNEL_GS_BASE, right?
>>>
>>>
>> You can't do that (at least not without further checks) if user space has WRGSBASE enabled, since you have no guarantee that the active GS.base is consistent with GS.selector.
>>
>> Since GS > 3 is pretty rare in 64-bit code at least, it doesn't seem to be a code path that needs to be that heavily optimized.
>
>I think you're slightly talking past each other, and I also made a
>mistake in the original reply, so let's try rephrasing it.
>
>LKGS only writes a zero-extended 32-bit value into KERN_GS_BASE. This is
>because there's only 32 bits of information in the GDT/LDT.
>
>So the real write into KERN_GS_BASE is still needed. Sorry - you can't
>optimise this away. Also, I'm pretty sure amluto did some x86 selftests
>covering this last time the logic was rewritten.
>
>
>As to WRMSR vs WRMSRNS, yes Intel CPUs want this to be WRMSRNS. AMD
>don't have WRMSRNS but this particular MSR index is architecturally not
>serialising anyway.
>
>~Andrew
It's not just a matter of it being a 32-bit base; even a 32-bit base might not be the correct one.
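
To make the two failure cases concrete, here is a rough user-space model, not kernel code. Everything in it is hypothetical and exists only for illustration: the model_* names, the fake descriptor table in model_desc_base(), and the skip_base_write flag. It assumes only what was said above: LKGS loads a zero-extended 32-bit descriptor base into the inactive GS base, and with FSGSBASE user space can leave GS.base at a value unrelated to GS.selector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state: the inactive (user) GS base and selector. */
struct model_cpu {
	uint64_t kernel_gs_base;	/* stands in for MSR_KERNEL_GS_BASE */
	uint16_t gs_sel;
};

/* Hypothetical saved state of the task being switched in. */
struct model_task {
	uint16_t gsindex;	/* saved GS selector */
	uint64_t gsbase;	/* saved GS base, possibly set via WRGSBASE */
};

/* Made-up descriptor table: null selectors (0..3) give base 0. */
static uint32_t model_desc_base(uint16_t sel)
{
	return sel <= 3 ? 0 : 0x1000u * sel;
}

/* Model of LKGS: load the selector, zero-extend the 32-bit descriptor base. */
static void model_lkgs(struct model_cpu *cpu, uint16_t sel)
{
	cpu->gs_sel = sel;
	cpu->kernel_gs_base = (uint64_t)model_desc_base(sel);
}

/* Model of the switch path: LKGS first, then the explicit base write. */
static void model_switch_gs(struct model_cpu *cpu,
			    const struct model_task *next,
			    bool skip_base_write)
{
	model_lkgs(cpu, next->gsindex);
	if (!skip_base_write)
		cpu->kernel_gs_base = next->gsbase;	/* the "second" write */
}

/* Walk through both cases; asserts document the expected model state. */
static void model_demo(void)
{
	struct model_cpu cpu = { 0, 0 };
	/* Case 1: null selector, 64-bit base set by WRGSBASE. */
	struct model_task wide = { 0, 0x00007f0012345000ULL };
	/* Case 2: non-null selector, base fits in 32 bits but differs
	 * from what the descriptor says. */
	struct model_task stale = { 0x2b, 0x00400000ULL };

	model_switch_gs(&cpu, &wide, true);
	assert(cpu.kernel_gs_base != wide.gsbase);	/* base lost */
	model_switch_gs(&cpu, &wide, false);
	assert(cpu.kernel_gs_base == wide.gsbase);

	model_switch_gs(&cpu, &stale, true);
	assert(cpu.kernel_gs_base != stale.gsbase);	/* wrong 32-bit base */
	model_switch_gs(&cpu, &stale, false);
	assert(cpu.kernel_gs_base == stale.gsbase);
}
```

In this model, skipping the explicit write leaves the task with whatever base the descriptor supplies, which is wrong in both cases: once because the saved base is wider than 32 bits, once because it simply differs from the descriptor.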