Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

From: Andy Lutomirski
Date: Fri Jan 06 2017 - 17:24:28 EST


On Fri, Jan 6, 2017 at 10:03 AM, Thomas Garnier <thgarnie@xxxxxxxxxx> wrote:
> On Thu, Jan 5, 2017 at 10:49 PM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>>
>> * Thomas Garnier <thgarnie@xxxxxxxxxx> wrote:
>>
>>> >> Not sure I fully understood and I don't want to miss an important point. Do
>>> >> you mean making GDT (remapping and per-cpu) read-only and switch the
>>> >> writeable flag only when we write to the per-cpu entry?
>>> >
>>> > What I mean is: write to the GDT through normal percpu access (or whatever the
>>> > normal mapping is) but load a read-only alias into the GDT register. As long
>>> > as nothing ever tries to write through the GDTR alias, no page faults will be
>>> > generated. So we just need to make sure that nothing ever writes to it
>>> > through GDTR. AFAIK the only reason the CPU ever writes to the address in
>>> > GDTR is to set an accessed bit.
>>>
>>> A write is made when we use load_TR_desc (ltr). I didn't see any other yet.
>>
>> Is this write to the GDT, generated by the LTR instruction, done unconditionally
>> by the hardware?
>>
>
> That was my experience. I didn't look into details. Do you think we
> could change something so that ltr never writes to the GDT? (just mark
> the TSS entry busy).

No, and I had the way this worked on 64-bit wrong. LTR requires an
available TSS and changes it to busy. So here are my thoughts on how
this should work:

Let's get rid of any connection between this code and KASLR. Every
time KASLR makes something work differently, a kitten turns all
Schrödinger on us. This is moving the GDT to the fixmap, plain and
simple. For now, make it one page per CPU and don't worry about the
GDT limit.

On 32-bit, we're going to have to make the fixmap GDT be read-write
because making it read-only will break double-fault handling.

On 64-bit, we can use your trick of temporarily mapping the GDT
read-write every time we load TR, which should happen very rarely.
Alternatively, we can reload the *GDT* every time we reload TR, which
should be comparably slow. This is going to regress performance in
the extremely rare case where KVM exits to a process that uses
ioperm() (I think), but I doubt anyone cares. Or maybe we could
arrange to never reload TR while GDTR points at the fixmap: have KVM
set the host GDT to the direct version, and let KVM's GDT-reload code
switch back to the fixmap copy afterwards.
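
To make the 64-bit constraint concrete, here is a minimal user-space
model of the "temporarily map read-write around LTR" idea. It is a
sketch only, not kernel code: struct gdt_model and every model_*
function are hypothetical names invented for illustration. The only
hardware facts it relies on are the 64-bit TSS descriptor type values
(0x9 available, 0xB busy) and that LTR rejects a TSS that is not
available. Clearing the busy bit through the normal mapping before the
reload is an assumption about what the caller would do.

```c
#include <stdbool.h>
#include <stdint.h>

/* 64-bit system-descriptor type values for a TSS. */
#define DESC_TSS_AVAILABLE 0x9u
#define DESC_TSS_BUSY      0xBu

/* Hypothetical model: one TSS descriptor reachable through two mappings. */
struct gdt_model {
    uint8_t tss_type;      /* type field of the TSS descriptor */
    bool    gdtr_alias_rw; /* is the alias loaded into GDTR writable? */
};

enum ltr_result { LTR_OK, LTR_GP_FAULT, LTR_PAGE_FAULT };

/* Models LTR: it needs an available TSS and writes "busy" via GDTR. */
static enum ltr_result model_ltr(struct gdt_model *g)
{
    if (g->tss_type != DESC_TSS_AVAILABLE)
        return LTR_GP_FAULT;   /* TSS is not available */
    if (!g->gdtr_alias_rw)
        return LTR_PAGE_FAULT; /* write through a read-only alias */
    g->tss_type = DESC_TSS_BUSY;
    return LTR_OK;
}

/* Models a write through the normal (always writable) mapping. */
static void model_mark_tss_available(struct gdt_model *g)
{
    g->tss_type = DESC_TSS_AVAILABLE;
}

/* Models the trick: make the GDTR alias writable only around LTR. */
static enum ltr_result model_load_tr_with_remap(struct gdt_model *g)
{
    bool was_rw = g->gdtr_alias_rw;
    enum ltr_result r;

    model_mark_tss_available(g); /* assumption: caller clears busy first */
    g->gdtr_alias_rw = true;     /* temporary read-write mapping */
    r = model_ltr(g);
    g->gdtr_alias_rw = was_rw;   /* restore the original protection */
    return r;
}
```

In this model a plain model_ltr() against a read-only alias faults,
while model_load_tr_with_remap() succeeds and leaves the alias
read-only again. The alternative of reloading the GDT around LTR would
swap which mapping GDTR points at rather than toggling the protection.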

If we need a quirk to keep the fixmap copy read-write, so be it.

None of this should depend on KASLR. IMO it should happen unconditionally.

Once all of it works, then we can build on it to allocate four pages
per CPU (with the extra three pointing to the zero page) and speed
up KVM.

--Andy