Re: [RFC PATCH 2/6] x86/mm: temporary mm struct

From: Andy Lutomirski
Date: Wed Aug 29 2018 - 11:41:25 EST


On Wed, Aug 29, 2018 at 2:49 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> On Wed, 29 Aug 2018 01:11:43 -0700
> Nadav Amit <namit@xxxxxxxxxx> wrote:
>
>> From: Andy Lutomirski <luto@xxxxxxxxxx>
>>
>> Sometimes we want to set temporary page-table entries (PTEs) on one of
>> the cores, without allowing other cores to use - even speculatively -
>> these mappings. There are two benefits to doing so:
>>
>> (1) Security: if sensitive PTEs are set, temporary mm prevents their use
>> on other cores. This hardens security, as it prevents exploiting a
>> dangling pointer to overwrite sensitive data using the sensitive PTE.
>>
>> (2) Avoiding TLB shootdowns: the PTEs do not need to be flushed in
>> remote page-tables.
>>
>> To do so, a temporary mm_struct can be used. Mappings which are private
>> to this mm can be set in the userspace part of the address-space.
>> During the whole time in which the temporary mm is loaded, interrupts
>> must be disabled.
>>
>> The first use-case for temporary PTEs, which will follow, is for poking
>> the kernel text.
>>
>> [ Commit message was written by Nadav ]
>>
>> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
>> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
>> ---
>> arch/x86/include/asm/mmu_context.h | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
>> index eeeb9289c764..96afc8c0cf15 100644
>> --- a/arch/x86/include/asm/mmu_context.h
>> +++ b/arch/x86/include/asm/mmu_context.h
>> @@ -338,4 +338,24 @@ static inline unsigned long __get_current_cr3_fast(void)
>> return cr3;
>> }
>>
>> +typedef struct {
>> +	struct mm_struct *prev;
>> +} temporary_mm_state_t;
>> +
>> +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
>> +{
>> +	temporary_mm_state_t state;
>> +
>> +	lockdep_assert_irqs_disabled();
>> +	state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
>> +	switch_mm_irqs_off(NULL, mm, current);
>> +	return state;
>> +}
>
> Hmm, why don't we return mm_struct *prev directly?

I did it this way to make it easier to add debugging stuff later.
Also, when I first wrote this, I stashed the old CR3 instead of the
old mm_struct, and it seemed like callers should be insulated from
details like this.
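
To illustrate the insulation point, here is a minimal userspace sketch,
not kernel code. The struct mm_struct stand-in, the loaded_mm global
(mocking cpu_tlbstate.loaded_mm), the "active" debugging field, the
unuse_temporary_mm() counterpart as written here, and the
poke_with_temporary_mm() caller are all assumptions made up for the
example. The caller only passes the token back, so the token's contents
can change (say, to a saved CR3 value) without touching it.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct mm_struct; only identity matters here. */
struct mm_struct {
	int id;
};

/* Mock of the per-CPU cpu_tlbstate.loaded_mm. */
static struct mm_struct *loaded_mm;

/* Opaque token: callers hold it but never look inside. */
typedef struct {
	struct mm_struct *prev;
	int active;	/* hypothetical debugging field */
} temporary_mm_state_t;

static temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
{
	temporary_mm_state_t state;

	state.prev = loaded_mm;
	state.active = 1;
	loaded_mm = mm;	/* stands in for switch_mm_irqs_off() */
	return state;
}

/* Hypothetical counterpart: restores whatever use_temporary_mm() saved. */
static void unuse_temporary_mm(temporary_mm_state_t state)
{
	assert(state.active);	/* debugging check, invisible to callers */
	loaded_mm = state.prev;
}

/* Caller-side pattern; unchanged if the token's contents change. */
static int poke_with_temporary_mm(struct mm_struct *tmp)
{
	temporary_mm_state_t saved = use_temporary_mm(tmp);
	int used_tmp = (loaded_mm == tmp);

	unuse_temporary_mm(saved);
	return used_tmp;
}
```

Had the function returned a bare mm_struct pointer, adding the "active"
field or switching to a saved CR3 would have meant changing the
signature and every caller along with it.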