Re: Save a WRMSR GS.base?
From: Andrew Cooper
Date: Fri Jun 05 2026 - 11:25:18 EST
On 05/06/2026 4:13 pm, H. Peter Anvin wrote:
> On June 5, 2026 2:13:07 AM PDT, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 05/06/2026 6:05 am, H. Peter Anvin wrote:
>>> On June 4, 2026 9:38:46 PM PDT, Borislav Petkov <bp@xxxxxxxxx> wrote:
>>>> On Thu, Jun 04, 2026 at 09:30:33PM -0700, H. Peter Anvin wrote:
>>>>> On June 4, 2026 9:26:52 PM PDT, Borislav Petkov <bp@xxxxxxxxx> wrote:
>>>>>> On Thu, Jun 04, 2026 at 08:20:57PM -0700, H. Peter Anvin wrote:
>>>>>>> I guess the question is why there is a "first" one.
>>>>>> That happens when we do:
>>>>>>
>>>>>> x86_fsgsbase_load()
>>>>>>
>>>>>> loadseg(GS) -> load_gs_index() -> native_load_gs_index() ->
>>>>>> if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>>>>> native_lkgs(selector);
>>>>>>
>>>>>> then back in x86_fsgsbase_load() we do:
>>>>>>
>>>>>> __wrgsbase_inactive(next->gsbase);
>>>>>>
>>>>>> which does
>>>>>>
>>>>>> wrmsrq(MSR_KERNEL_GS_BASE, gsbase);
>>>>>>
>>>>>> on FRED.
>>>>>>
>>>>>> But LKGS already wrote MSR_KERNEL_GS_BASE...
>>>>>>
>>>>>>> Logically the sequence should be LKGS first, if needed; then WRMSR(NS). LKGS
>>>>>>> can be replaced with swapgs/mov gs/swapgs on legacy.
>>>>>> Right.
>>>>>>
>>>>>> I think avoiding that second WRMSR(MSR_KERNEL_GS_BASE) should give some perf
>>>>>> back...
>>>>>>
>>>>>> Although, I need to think how to make it pretty...
>>>>>>
>>>>> Should be doing wrmsrns...
>>>> No, I think that second WRMSR* should not happen at all if we have executed
>>>> LKGS which has already written MSR_KERNEL_GS_BASE, right?
>>>>
>>>>
>>> You can't do that (at least not without further checks) if user space has WRGSBASE enabled, since you have no guarantee that the active GS.base is consistent with GS.selector.
>>>
>>> Since GS > 3 is pretty rare in 64-bit code at least, it doesn't seem to be a code path that needs to be that heavily optimized.
>> I think you're slightly talking past each other, and I also made a
>> mistake on the original reply, so let's try rephrasing it.
>>
>> LKGS only writes a zero-extended 32-bit value into KERN_GS_BASE. This is
>> because there's only 32 bits of information in the GDT/LDT.
>>
>> So the real write into KERN_GS_BASE is still needed. Sorry - you can't
>> optimise this away. Also, I'm pretty sure amluto did some x86 selftests
>> covering this last time the logic was rewritten.
>>
>>
>> As to WRMSR vs WRMSRNS, yes Intel CPUs want this to be WRMSRNS. AMD
>> don't have WRMSRNS but this particular MSR index is architecturally
>> not serialising anyway.
>>
>> ~Andrew
> It's not just a matter of it being a 32-bit base; it might not even be the correct one.
Indeed, GS might be an LDT selector.
~Andrew
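
To make the point above concrete, here is a minimal toy model in C, not kernel code. It assumes only what was said in the thread: LKGS puts a zero-extended 32-bit descriptor base into KERN_GS_BASE, the descriptor may come from the GDT or the LDT depending on the selector's TI bit, and the task's real 64-bit gsbase is unrelated to either. Every name below (model_cpu, model_lkgs, model_write_kernel_gs_base, model_switch_gs) is hypothetical and exists only for illustration; null-selector and fault behaviour are not modelled faithfully.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are kernel or hardware interfaces. */

#define SEL_TI_LDT    0x4u  /* selector bit 2: 0 = GDT, 1 = LDT */
#define TABLE_ENTRIES 4     /* arbitrary size for the illustration */

struct model_cpu {
	uint32_t gdt_base[TABLE_ENTRIES]; /* a descriptor carries a 32-bit base */
	uint32_t ldt_base[TABLE_ENTRIES];
	uint16_t gs_selector;
	uint64_t kernel_gs_base;          /* stands in for MSR_KERNEL_GS_BASE */
};

/*
 * Models only the base-loading side effect attributed to LKGS in the
 * thread: the descriptor base is zero-extended into KERN_GS_BASE.
 * Returns -1 if the selector index is outside the toy tables.
 */
static int model_lkgs(struct model_cpu *cpu, uint16_t selector)
{
	assert(cpu != NULL);

	unsigned int index = selector >> 3; /* bits 15:3 are the table index */
	if (index >= TABLE_ENTRIES)
		return -1;

	const uint32_t *table = (selector & SEL_TI_LDT) ? cpu->ldt_base
							: cpu->gdt_base;

	cpu->gs_selector = selector;
	cpu->kernel_gs_base = (uint64_t)table[index]; /* zero-extension */
	return 0;
}

/* Stands in for the explicit WRMSR/WRMSRNS of the full 64-bit value. */
static void model_write_kernel_gs_base(struct model_cpu *cpu, uint64_t gsbase)
{
	assert(cpu != NULL);
	cpu->kernel_gs_base = gsbase;
}

/*
 * Switch to the next task's GS state. If skip_msr_write is non-zero, the
 * second write is "optimised away" and whatever LKGS loaded is left in
 * place. Returns the resulting KERN_GS_BASE, or UINT64_MAX on a bad
 * selector.
 */
static uint64_t model_switch_gs(struct model_cpu *cpu, uint16_t next_sel,
				uint64_t next_gsbase, int skip_msr_write)
{
	if (model_lkgs(cpu, next_sel) != 0)
		return UINT64_MAX;

	if (!skip_msr_write)
		model_write_kernel_gs_base(cpu, next_gsbase);

	return cpu->kernel_gs_base;
}
```

In this model, skipping the second write leaves KERN_GS_BASE holding whichever 32-bit descriptor base the selector happened to pick (GDT or LDT), which in general has nothing to do with a gsbase that user space set with WRGSBASE. Only the explicit write restores the task's value.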