Re: Updated version of RD/WR FS/GS BASE patchkit
From: Andy Lutomirski
Date: Mon Mar 21 2016 - 18:06:14 EST
On Mon, Mar 21, 2016 at 12:40 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>> You're adding an hwcap bit because you expect user code to use this
>> thing, which means you're adding an ABI, which means that the
>> semantics should be given due consideration.
>
> Right I did that and concluded the existing semantics are fine.
> They also worked fine for many years with the system call.
>
> We have two different modes:
>
> - Code uses old FS/GS selector, gs selector is not zero
> In this case the selector base in GDT/LDT takes precedence.
In this case the selector base in GDT/LDT is the whole story because
arch_prctl zeroes the selector.
>
> This is legacy, but still works fine.
>
> - Code uses 64bit base, either through arch_prctl or the new
> instructions. In this case FS/GS selector has to be zero.
>
> This is the new expected mode for 64bit code.
>
> With the new instructions the modes can be temporarily
> out of sync (GS/FS != 0, but a different base loaded),
> but will always be reset on the next context switch.
>
> Your previous objection was that this allows detecting
> context switches, but that's already possible in other
> ways so I think it's a red herring.
>
> Also if you really want to change it you can do so
> in a follow-on patch under your own name.
ARCH_SET_FS and ARCH_SET_GS *zero the selector*. WRFSBASE and
WRGSBASE *do not zero the selector*. This design is, in my mind,
obnoxious and represents an error on Intel's part, but it's what the
docs say the cpu does and I have no reason to doubt the docs.
So a patchset to enable these asinine new instructions needs to take
this into account, and the ABI issue needs to be addressed, even if
the answer is that the proposed code is fine.
(Also, the existing code is fscked up. Guess what xor %eax, %eax; mov
%ax, %gs does to the base on AMD? The existing code is *wrong*, and I
don't want to see it get wronger.)
And no, I don't really care about programs detecting context switches.
I do, however, care about allowing non-determinism in things that
ought to behave deterministically. Writing a nonzero value to %gs and
then doing WRGSBASE is something that user code will be able to do
whether we like it or not. Some shitty threading library is likely to
do this just to spite us, and the kernel needs to do *something* when
this happens.