Re: [RFC][PATCH v2 11/21] x86/pti: Extend PTI user mappings

From: Andy Lutomirski
Date: Tue Nov 17 2020 - 10:50:12 EST


On Tue, Nov 17, 2020 at 12:42 AM Alexandre Chartre
<alexandre.chartre@xxxxxxxxxx> wrote:
>
>
> On 11/17/20 12:06 AM, Andy Lutomirski wrote:
> > On Mon, Nov 16, 2020 at 12:18 PM Alexandre Chartre
> > <alexandre.chartre@xxxxxxxxxx> wrote:
> >>
> >>
> >> On 11/16/20 8:48 PM, Andy Lutomirski wrote:
> >>> On Mon, Nov 16, 2020 at 6:49 AM Alexandre Chartre
> >>> <alexandre.chartre@xxxxxxxxxx> wrote:
> >>>>
> >>>> Extend PTI user mappings so that more kernel entry code can be executed
> >>>> with the user page-table. To do so, we need to map syscall and interrupt
> >>>> entry code, per cpu offsets (__per_cpu_offset, which is used in some
> >>>> entry code), the stack canary, and the PTI stack (which is defined per
> >>>> task).
> >>>
> >>> Does anything unmap the PTI stack? Mapping is easy, and unmapping
> >>> could be a pretty big mess.
> >>>
> >>
> >> No, there's no unmap. The mapping exists as long as the task page-table
> >> does (i.e. as long as the task mm exists). I assume that the task stack
> >> and mm are freed at the same time but that's not something I have checked.
> >>
> >
> > Nope. A multi-threaded mm will free task stacks when the task exits,
> > but the mm may outlive the individual tasks. Additionally, if you
> > allocate page tables as part of mapping PTI stacks, you need to make
> > sure the pagetables are freed.
>
> So I think I just need to unmap the PTI stack from the user page-table
> when the task exits. Everything else is handled because the kernel and
> PTI stack are allocated in a single chunk (referenced by task->stack).
>
>
> > Finally, you need to make sure that
> > the PTI stacks have appropriate guard pages -- just doubling the
> > allocation is not safe enough.
>
> The PTI stack does have guard pages because it maps only a part of the task
> stack into the user page-table, so pages around the PTI stack are not mapped
> into the user-pagetable (the page below is the task stack guard, and the page
> above is part of the kernel-only stack so it's never mapped into the user
> page-table).
>
> + * +-------------+
> + * |             | ^                       ^
> + * | kernel-only | | KERNEL_STACK_SIZE     |
> + * |    stack    | |                       |
> + * |             | V                       |
> + * +-------------+ <- top of kernel stack  | THREAD_SIZE
> + * |             | ^                       |
> + * | kernel and  | | KERNEL_STACK_SIZE     |
> + * | PTI stack   | |                       |
> + * |             | V                       v
> + * +-------------+ <- top of stack

There's no guard page between the stacks. That seems unfortunate.
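
To make the point concrete, here is a rough userspace sketch of the layout
in the diagram. It is not code from the patch: the EX_* sizes, the struct,
and the helper names are made up for illustration, and it assumes the
diagram is drawn with the lowest address at the top. The only thing it
shows is that two halves of a single THREAD_SIZE allocation are contiguous,
so nothing separates them in the kernel page-table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes for illustration only; the patch's values may differ. */
#define EX_PAGE_SIZE          4096UL
#define EX_KERNEL_STACK_SIZE  (4 * EX_PAGE_SIZE)
#define EX_THREAD_SIZE        (2 * EX_KERNEL_STACK_SIZE)

struct stack_region {
	uintptr_t start;	/* lowest address, inclusive */
	uintptr_t end;		/* highest address, exclusive */
};

/* First half of the allocation (assumed: drawn at the top of the diagram). */
struct stack_region kernel_only_stack(uintptr_t base)
{
	struct stack_region r = { base, base + EX_KERNEL_STACK_SIZE };
	return r;
}

/* Second half: the part that is also mapped into the user page-table. */
struct stack_region pti_stack(uintptr_t base)
{
	struct stack_region r = { base + EX_KERNEL_STACK_SIZE,
				  base + EX_THREAD_SIZE };
	return r;
}

/* Bytes separating two regions, where lo sits below hi in memory. */
size_t guard_bytes_between(struct stack_region lo, struct stack_region hi)
{
	assert(hi.start >= lo.end);	/* regions must not overlap */
	return (size_t)(hi.start - lo.end);
}
```

With any base address, guard_bytes_between(kernel_only_stack(base),
pti_stack(base)) comes out as 0 under these assumptions, which is the
missing guard page I am referring to.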

>
> > My intuition is that this is going to be far more complexity than is justified.
>
> Sounds like only the PTI stack unmap is missing, which is hopefully not
> that bad. I will check that.
>
> alex.