Re: [PATCH RFC 3/4] x86/pti: don't mark the user PGD with _PAGE_NX.

From: Andy Lutomirski
Date: Mon Jan 08 2018 - 18:05:54 EST


On 01/08/2018 09:03 AM, Dave Hansen wrote:
> On 01/08/2018 08:12 AM, Willy Tarreau wrote:
>> Since we're going to keep running on the same PGD when returning to
>> userspace for certain performance-critical tasks, we'll need the user
>> pages to be executable. So this code disables the extra protection
>> that was added consisting in marking user pages _PAGE_NX so that this
>> pgd remains usable for userspace.
>>
>> Note: it isn't necessarily the best approach, but one way or another
>> if we want to be able to return to userspace from the kernel,
>> we'll have to have this executable anyway. Another approach
>> might consist in using another pgd for userland+kernel but
>> the current core really looks like an extra careful measure
>> to catch early bugs if any.
>
> I don't like this.
>
> I think the prctl() should apply to an entire process, not to a thread.
> If it applies to a process, you can unpoison the PGD. I even had code
> to do this in an earlier version of the (whole system) runtime PTI
> on/off stuff.
>
> Why are you even posting half-baked hacks like this now? Is there
> something super-pressing about this set that we need to lock in a new
> ABI now?


I vote per-thread.

Anyway, we can easily sync the NX-clearing: just catch the spurious page fault and clear the bit. Avoiding infinite loops will need a bit of thought, but it's surely doable.
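
To make the lazy variant concrete, here is a minimal userspace sketch of the idea, not kernel code. Every name in it (lazy_mm, toy_fix_nx_fault, pti_off, the TOY_ constants) is hypothetical and exists only for illustration. It assumes the fault handler can tell which user PGD slot an instruction-fetch fault hit. The loop-avoidance rule shown is just one possible choice: only report the fault as spurious if the NX bit was actually still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_USER_PGD 4            /* toy size; the real user half is much larger */
#define TOY_PAGE_NX (1ULL << 63) /* bit 63 is the NX bit in x86-64 paging entries */

/* Hypothetical stand-in for the per-mm state, not the kernel's mm_struct. */
struct lazy_mm {
    uint64_t user_pgd[NR_USER_PGD]; /* user half of the kernel-side PGD */
    bool pti_off;                   /* assumed flag: this mm opted out of PTI */
};

/*
 * Model of the fault path for an instruction-fetch fault in PGD slot idx.
 * Returns true if the fault was spurious (NX was still set and has now been
 * cleared), so the faulting fetch should simply be retried.
 * Returns false if NX was already clear, so a second fault at the same
 * place is treated as a real fault and cannot loop forever.
 */
bool toy_fix_nx_fault(struct lazy_mm *mm, unsigned int idx)
{
    assert(mm != NULL);
    if (!mm->pti_off || idx >= NR_USER_PGD)
        return false;
    if (!(mm->user_pgd[idx] & TOY_PAGE_NX))
        return false;
    mm->user_pgd[idx] &= ~TOY_PAGE_NX;
    return true;
}
```

Calling toy_fix_nx_fault() twice on the same slot returns true the first time and false the second, which is the property that keeps the retry from spinning.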

Or we set a per-mm flag saying "no NX". Whoever sets it first does synchronize_sched() or similar (or takes the pagetable lock) and then clears all the NX bits. Again, needs some care, but doable.
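
The flag-based variant could look roughly like the following userspace sketch. Again, this is an illustration under assumptions, not kernel code: flagged_mm, toy_mm_disable_nx and toy_wait_for_readers are made-up names, and the empty toy_wait_for_readers() merely marks the spot where synchronize_sched() or the pagetable lock would go.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_USER_PGD 4            /* toy size */
#define TOY_PAGE_NX (1ULL << 63) /* bit 63 is the NX bit in x86-64 paging entries */

/* Hypothetical stand-in for the per-mm state. */
struct flagged_mm {
    uint64_t user_pgd[NR_USER_PGD]; /* user half of the kernel-side PGD */
    atomic_bool no_nx;              /* the per-mm "no NX" flag */
};

/* Placeholder for synchronize_sched() or taking the pagetable lock. */
void toy_wait_for_readers(void)
{
}

/*
 * Set the per-mm flag. Only the caller that flips it from false to true
 * waits for concurrent users and then clears NX on every user PGD entry.
 * Returns true if this caller did the sweep, false if the flag was
 * already set by someone else.
 */
bool toy_mm_disable_nx(struct flagged_mm *mm)
{
    assert(mm != NULL);
    if (atomic_exchange(&mm->no_nx, true))
        return false;
    toy_wait_for_readers();
    for (size_t i = 0; i < NR_USER_PGD; i++)
        mm->user_pgd[i] &= ~TOY_PAGE_NX;
    return true;
}
```

One piece of the "care" is visible even in this toy: a second caller that loses the race returns before the first caller has finished the sweep, so real code would need it to wait for the sweep as well.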

FWIW, the NX trick quite nicely emulates SMEP on non-SMEP hardware, which is fantastic for Spectre resistance and general hardening. Turning it off totally defeats that, which hurts a bit.
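
For anyone who wants the SMEP comparison spelled out, here is a toy model of the two checks. The function names are invented for this sketch, and the bit positions are the architectural User/Supervisor bit (bit 2) and NX bit (bit 63) of an x86-64 paging entry. The model assumes kernel mode always runs on the kernel-side PGD, which is the situation PTI sets up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_USER (1ULL << 2)  /* User/Supervisor bit: set means user-accessible */
#define TOY_PAGE_NX   (1ULL << 63) /* No-execute bit */

/* Hardware SMEP: a supervisor-mode fetch from a user-accessible page faults. */
bool smep_blocks_fetch(bool supervisor, uint64_t entry)
{
    return supervisor && (entry & TOY_PAGE_USER) != 0;
}

/* NX: any fetch through an entry with NX set faults, whatever the privilege. */
bool nx_blocks_fetch(uint64_t entry)
{
    return (entry & TOY_PAGE_NX) != 0;
}

/*
 * The NX trick: the kernel-side copy of a user PGD entry is the same entry
 * with NX added, so a kernel-mode fetch from user addresses faults even on
 * hardware without SMEP.
 */
uint64_t kernel_view_of_user_pgd(uint64_t user_entry)
{
    assert(user_entry & TOY_PAGE_USER); /* only meaningful for user entries */
    return user_entry | TOY_PAGE_NX;
}
```

In this model, nx_blocks_fetch(kernel_view_of_user_pgd(e)) is true for every user entry e, matching smep_blocks_fetch(true, e). Clearing NX on the kernel-side copy makes the first check false again, which is the hardening loss described above.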

Also, Kees should be CC'd here.