Re: [RFC][PATCH v2 11/21] x86/pti: Extend PTI user mappings

From: Alexandre Chartre
Date: Tue Nov 17 2020 - 03:43:07 EST



On 11/17/20 12:06 AM, Andy Lutomirski wrote:
On Mon, Nov 16, 2020 at 12:18 PM Alexandre Chartre
<alexandre.chartre@xxxxxxxxxx> wrote:


On 11/16/20 8:48 PM, Andy Lutomirski wrote:
On Mon, Nov 16, 2020 at 6:49 AM Alexandre Chartre
<alexandre.chartre@xxxxxxxxxx> wrote:

Extend PTI user mappings so that more kernel entry code can be executed
with the user page-table. To do so, we need to map syscall and interrupt
entry code, per cpu offsets (__per_cpu_offset, which is used in some
entry code), the stack canary, and the PTI stack (which is defined per
task).

Does anything unmap the PTI stack? Mapping is easy, and unmapping
could be a pretty big mess.


No, there's no unmap. The mapping exists as long as the task page-table
does (i.e. as long as the task mm exists). I assume that the task stack
and mm are freed at the same time but that's not something I have checked.


Nope. A multi-threaded mm will free task stacks when the task exits,
but the mm may outlive the individual tasks. Additionally, if you
allocate page tables as part of mapping PTI stacks, you need to make
sure the pagetables are freed.

So I think I just need to unmap the PTI stack from the user page-table
when the task exits. Everything else is handled because the kernel and
PTI stack are allocated in a single chunk (referenced by task->stack).
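
To make the lifetime issue concrete, here is a toy user-space model (not
kernel code). The names toy_mm, toy_map_pti_stack and toy_task_exit are
hypothetical and only illustrate the bookkeeping: the mm is shared by several
threads, so a mapping added for one task's PTI stack has to be removed when
that task exits, because the mm itself may live on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_TASKS 8

/* Toy stand-in for an mm shared by several threads (hypothetical). */
struct toy_mm {
	uintptr_t pti_stack[TOY_MAX_TASKS];	/* 0 means the slot is free */
	int users;				/* tasks still using this mm */
};

/* Record that a task's PTI stack is mapped in the user page-table. */
static int toy_map_pti_stack(struct toy_mm *mm, uintptr_t stack)
{
	assert(stack != 0);
	for (size_t i = 0; i < TOY_MAX_TASKS; i++) {
		if (mm->pti_stack[i] == 0) {
			mm->pti_stack[i] = stack;
			mm->users++;
			return 0;
		}
	}
	return -1;	/* no free slot */
}

/* On task exit, drop that task's mapping; the mm may outlive the task. */
static void toy_task_exit(struct toy_mm *mm, uintptr_t stack)
{
	for (size_t i = 0; i < TOY_MAX_TASKS; i++) {
		if (mm->pti_stack[i] == stack) {
			mm->pti_stack[i] = 0;
			mm->users--;
			return;
		}
	}
}

/* Number of PTI stacks still mapped in this mm. */
static size_t toy_mapped_count(const struct toy_mm *mm)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_MAX_TASKS; i++)
		if (mm->pti_stack[i] != 0)
			n++;
	return n;
}
```

With two threads mapped and one exiting, toy_mapped_count() drops from 2 to 1
while the mm stays alive for the remaining thread. Without the call in
toy_task_exit(), the stale entry would stay in the mm after the stack is freed.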


Finally, you need to make sure that the PTI stacks have appropriate guard
pages -- just doubling the allocation is not safe enough.

The PTI stack does have guard pages because it maps only a part of the task
stack into the user page-table, so pages around the PTI stack are not mapped
into the user page-table (the page below is the task stack guard, and the page
above is part of the kernel-only stack so it's never mapped into the user
page-table).

+ * +-------------+
+ * |             |  ^                       ^
+ * | kernel-only |  | KERNEL_STACK_SIZE     |
+ * |    stack    |  |                       |
+ * |             |  V                       |
+ * +-------------+ <- top of kernel stack   | THREAD_SIZE
+ * |             |  ^                       |
+ * | kernel and   |  | KERNEL_STACK_SIZE     |
+ * | PTI stack    |  |                       |
+ * |             |  V                       v
+ * +-------------+ <- top of stack
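
As a rough user-space sketch of the layout above (not the patch code): the
page size and KERNEL_STACK_SIZE values are assumed for illustration, and
pti_stack_range() and user_mapped() are hypothetical helpers. It also assumes
the diagram is drawn with the highest address at the bottom ("top of stack"),
so the kernel and PTI stack is the upper half of the allocation; if the
orientation is the other way around, only the offset changes. The point is
that the addresses just outside the mapped half are not present in the user
page-table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the patch. */
#define PAGE_SIZE		4096UL
#define KERNEL_STACK_SIZE	(4 * PAGE_SIZE)
#define THREAD_SIZE		(2 * KERNEL_STACK_SIZE)

struct range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/*
 * Hypothetical helper: the part of the task stack that would be mapped
 * into the user page-table (the "kernel and PTI stack" half).
 */
static struct range pti_stack_range(uintptr_t stack_base)
{
	assert(stack_base % PAGE_SIZE == 0);

	struct range r = {
		.start = stack_base + KERNEL_STACK_SIZE,
		.end   = stack_base + THREAD_SIZE,
	};
	return r;
}

/* True if addr falls in the part of the stack visible to the user page-table. */
static bool user_mapped(uintptr_t stack_base, uintptr_t addr)
{
	struct range r = pti_stack_range(stack_base);

	return addr >= r.start && addr < r.end;
}
```

For a stack at base B, user_mapped() is true for [B + KERNEL_STACK_SIZE,
B + THREAD_SIZE) and false for the last byte of the kernel-only stack and for
the first byte past the allocation, which is where the guard would sit.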

My intuition is that this is going to be far more complexity than is justified.

Sounds like only the PTI stack unmap is missing, which is hopefully not
that bad. I will check that.

alex.