[PATCH v2] x86/ibpb: Skip IBPB when we switch back to same user process

From: Tim Chen
Date: Thu Jan 25 2018 - 18:57:40 EST


Thanks to the reviewers, and to Andy Lutomirski for the suggestion of
using ctx_id, which gets rid of the problem of mm pointer recycling.
Here's an update of this patch based on Andy's suggestion.
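For reference, ctx_id is safe to compare across a hiatus because it is
never reused: each mm gets a fresh id from a global 64-bit counter at
creation time, unlike the mm_struct pointer, which the allocator can hand
back for an unrelated new process. Trimmed from this era's
init_new_context in arch/x86/include/asm/mmu_context.h:

	static inline int init_new_context(struct task_struct *tsk,
					   struct mm_struct *mm)
	{
		/* A 64-bit counter will not wrap in practice, so a given
		 * ctx_id value identifies one mm for the machine's lifetime.
		 */
		mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id);
		...
	}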

We could switch to a kernel idle thread and then back to the original
process, such as:
process A -> idle -> process A

In such a scenario, we do not have to do IBPB here even though the
process is non-dumpable, as we are switching back to the same process
after a hiatus. A sketch of the resulting check is below.
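The new check boils down to the following sketch (the helper name is just
for illustration; the real change is open-coded in switch_mm_irqs_off, see
the diff):

	/*
	 * Do the barrier only when the incoming task has a user mm (i.e.
	 * is not a kernel thread), that mm differs from the last user mm
	 * this CPU ran (ctx_id mismatch), and the task is non-dumpable.
	 */
	static bool needs_ibpb(struct task_struct *tsk, u64 last_ctx_id)
	{
		return tsk && tsk->mm &&
		       tsk->mm->context.ctx_id != last_ctx_id &&
		       get_dumpable(tsk->mm) != SUID_DUMP_USER;
	}

In the process A -> idle -> process A case, last_ctx_id still holds A's
ctx_id when we come back (it is only updated for user mms), so the ctx_id
test fails and the barrier is skipped.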

We track the last user mm's context id before switching to init_mm, which
happens via leave_mm when tlb_defer_switch_to_init_mm returns false (PCID
available).

The cost is an extra u64 in the per-cpu tlb_state to remember the last mm
we were using before switching to the init_mm used by idle. Avoiding the
extra IBPB is probably worth the extra memory for this common scenario.

For those cases where tlb_defer_switch_to_init_mm returns true (no PCID),
lazy TLB mode defers the switch to init_mm, so we will not be changing the
mm for the process A -> idle -> process A transition, and IBPB will be
skipped in that case as well (see the sketch below).
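For reference, the defer heuristic is keyed off PCID, roughly as in this
era's arch/x86/include/asm/tlbflush.h (comment rewritten for this context):

	static inline bool tlb_defer_switch_to_init_mm(void)
	{
		/*
		 * With PCID, switching to init_mm is cheap, so idle goes
		 * through leave_mm and really changes the loaded mm; we
		 * must remember the old ctx_id to skip the IBPB on return.
		 * Without PCID, we stay lazy on the user mm, so no mm
		 * switch (and hence no IBPB check) happens at all.
		 */
		return !static_cpu_has(X86_FEATURE_PCID);
	}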

v2:
1. Save the last user context id instead of the last user mm, to avoid the
problem of recycled mm pointers

Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/mm/tlb.c               | 23 ++++++++++++++++-------
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 3effd3c..4405c4b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -174,6 +174,8 @@ struct tlb_state {
 	struct mm_struct *loaded_mm;
 	u16 loaded_mm_asid;
 	u16 next_asid;
+	/* last user mm's ctx id */
+	u64 last_ctx_id;
 
 	/*
 	 * We can be in one of several states:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 33f5f97..2179b90 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -220,6 +220,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	} else {
 		u16 new_asid;
 		bool need_flush;
+		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
 
 		/*
 		 * Avoid user/user BTB poisoning by flushing the branch predictor
@@ -230,14 +231,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		 * switching into processes that disable dumping.
 		 *
 		 * This will not flush branches when switching into kernel
-		 * threads, but it would flush them when switching to the
-		 * idle thread and back.
-		 *
-		 * It might be useful to have a one-off cache here
-		 * to also not flush the idle case, but we would need some
-		 * kind of stable sequence number to remember the previous mm.
+		 * threads. It will also not flush if we switch to the idle
+		 * thread and back to the same process. It will flush if we
+		 * switch to a different non-dumpable process.
 		 */
-		if (tsk && tsk->mm && get_dumpable(tsk->mm) != SUID_DUMP_USER)
+		if (tsk && tsk->mm &&
+		    tsk->mm->context.ctx_id != last_ctx_id &&
+		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
 			indirect_branch_prediction_barrier();
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -288,6 +288,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
 		}
 
+		/*
+		 * Record last user mm's context id, so we can avoid
+		 * flushing branch buffer with IBPB if we switch back
+		 * to the same user.
+		 */
+		if (next != &init_mm)
+			this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
 		this_cpu_write(cpu_tlbstate.loaded_mm, next);
 		this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
 	}
@@ -365,6 +373,7 @@ void initialize_tlbstate_and_flush(void)
 	write_cr3(build_cr3(mm->pgd, 0));
 
 	/* Reinitialize tlbstate. */
+	this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
 	this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
 	this_cpu_write(cpu_tlbstate.next_asid, 1);
 	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
--
2.9.4