Re: [RFC][PATCH v2 12/21] x86/pti: Use PTI stack instead of trampoline stack

From: Alexandre Chartre
Date: Thu Nov 19 2020 - 14:56:04 EST



On 11/19/20 8:10 PM, Thomas Gleixner wrote:
On Mon, Nov 16 2020 at 19:10, Alexandre Chartre wrote:
On 11/16/20 5:57 PM, Andy Lutomirski wrote:
On Mon, Nov 16, 2020 at 6:47 AM Alexandre Chartre
<alexandre.chartre@xxxxxxxxxx> wrote:
When executing more code in the kernel, we are likely to reach a point
where we need to sleep while we are using the user page-table, so we need
to be using a per-thread stack.

I can't immediately evaluate how nasty the page table setup is because
it's not in this patch.

The page-table is the regular page-table as introduced by PTI. It is just
augmented with a few additional mappings, which are in patch 11 (x86/pti:
Extend PTI user mappings).

But AFAICS the only thing that this enables is sleeping with user pagetables.

That's precisely the point: it allows sleeping with the user page-table.

Coming late, but this does not make any sense to me.

Unless you map most of the kernel into the user page-table, sleeping with
the user page-table _cannot_ work. And if you do that, you have broken KPTI.

Nor can you pick arbitrary points in the C code of an exception
handler to switch to the kernel mapping, unless you have mapped everything
which might be touched before that into user space.

How is that supposed to work?


Sorry, I mixed up a few things; I got confused by my own code, which is not a
good sign...

It's not sleeping with the user page-table, which, as you mentioned, doesn't
make sense; it's sleeping with the kernel page-table but with the PTI stack.

Basically, it is:
- entering C code with (user page-table, PTI stack);
- then it switches to the kernel page-table so we have (kernel page-table, PTI stack);
- and then it switches to the kernel stack so we have (kernel page-table, kernel stack).

As this is all C code, some of which is executed on the PTI stack, we need the PTI stack
to be per-task so that the stack is preserved if that C code sleeps or schedules
(no matter whether this happens while using the PTI stack or the kernel stack).
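
To make the sequence above concrete, here is a minimal user-space sketch of the
three states as a tiny state machine. It is only a model: every name in it
(entry_state, enter_c_code, switch_to_kernel_pgtable, switch_to_kernel_stack,
may_sleep) is hypothetical and does not come from the patch series, and it does
not touch CR3 or any real stack.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only; none of these names exist in the kernel. */
enum page_table { USER_PGTABLE, KERNEL_PGTABLE };
enum stack_kind { PTI_STACK, KERNEL_STACK };

struct entry_state {
	enum page_table pgt;
	enum stack_kind stack;
};

/* Step 1: C code is entered with (user page-table, PTI stack). */
static struct entry_state enter_c_code(void)
{
	return (struct entry_state){ .pgt = USER_PGTABLE, .stack = PTI_STACK };
}

/* Step 2: switch the page-table first, still running on the PTI stack. */
static void switch_to_kernel_pgtable(struct entry_state *s)
{
	assert(s->pgt == USER_PGTABLE && s->stack == PTI_STACK);
	s->pgt = KERNEL_PGTABLE;
}

/* Step 3: only then move to the kernel stack. */
static void switch_to_kernel_stack(struct entry_state *s)
{
	assert(s->pgt == KERNEL_PGTABLE && s->stack == PTI_STACK);
	s->stack = KERNEL_STACK;
}

/*
 * Sleeping is modelled as allowed only once the kernel page-table is
 * active. That includes the (kernel page-table, PTI stack) state, which
 * is why the PTI stack has to be per-task in this scheme.
 */
static bool may_sleep(const struct entry_state *s)
{
	return s->pgt == KERNEL_PGTABLE;
}
```

In this model may_sleep() is false right after enter_c_code() and becomes true
after switch_to_kernel_pgtable(), i.e. while the task is still on the PTI stack.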

alex.