[RFC PATCH 2/2] Fix: x86: Add missing core serializing instruction on migration

From: Mathieu Desnoyers
Date: Fri Nov 10 2017 - 16:13:16 EST


x86 is missing a core serializing instruction in migration scenarios.

Given that x86-32 can return to user-space with sysexit, and x86-64
through sysretq and sysexit, which are not core serializing, the
following user-space self-modifying code (JIT) scenario can occur:

     CPU 0                              CPU 1

User-space self-modify code
Preempted
migrated                      ->
                                        scheduler selects task
                                        Return to user-space (iret or sysexit)
                                        User-space issues sync_core()
                              <-        migrated
scheduler selects task
Return to user-space (sysexit)
jump to modified code
Run modified code without sync_core() -> bug.

This migration pattern can return to user-space through sysexit or
sysret64, neither of which is core serializing, and therefore breaks
the sequential consistency expectations of a single-threaded process.
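
The timeline above can be replayed with a small user-space model. This
is only a sketch of the reasoning, not kernel code: struct toy_cpu, the
"stale" flag and every toy_* helper are hypothetical names introduced
for illustration, and the model simply assumes that iret is the only
core serializing return path, as the description above states.

```c
#include <stdbool.h>

/* Hypothetical return-to-user paths; the model treats only iret as serializing. */
enum toy_ret_path { TOY_RET_IRET, TOY_RET_SYSEXIT, TOY_RET_SYSRET };

struct toy_cpu {
	bool stale;	/* core has not serialized since the code was modified */
};

/* Self-modification: no core has serialized against the new code yet. */
static void toy_modify_code(struct toy_cpu *cpus, int nr_cpus)
{
	for (int i = 0; i < nr_cpus; i++)
		cpus[i].stale = true;
}

/* User-space sync_core(): serializes only the core it executes on. */
static void toy_sync_core(struct toy_cpu *cpu)
{
	cpu->stale = false;
}

/* Returning to user-space serializes the core only when iret is used. */
static void toy_return_to_user(struct toy_cpu *cpu, enum toy_ret_path path)
{
	if (path == TOY_RET_IRET)
		cpu->stale = false;
}

/*
 * Replay the scenario. Returns true when CPU 0 ends up running the
 * modified code without having serialized, i.e. the bug is hit.
 */
static bool toy_run_scenario(enum toy_ret_path final_return)
{
	struct toy_cpu cpus[2] = { { false }, { false } };

	toy_modify_code(cpus, 2);			/* CPU 0: self-modify, preempted */
	toy_return_to_user(&cpus[1], TOY_RET_SYSEXIT);	/* CPU 1: task migrated here */
	toy_sync_core(&cpus[1]);			/* CPU 1: user-space sync_core() */
	toy_return_to_user(&cpus[0], final_return);	/* CPU 0: task migrated back */
	return cpus[0].stale;				/* CPU 0: jump to modified code */
}
```

In this model the bug shows up when the final return uses sysexit or
sysret, and disappears when it uses iret, because the user-space
sync_core() only ever ran on CPU 1.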

Fix this issue by invoking sync_core_before_usermode() the first
time a runqueue finishes a task switch after receiving a migrated
thread.
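
The flag handling this relies on can be sketched outside the kernel as
follows. This is a toy model under stated assumptions: struct toy_rq and
the toy_* functions are hypothetical stand-ins for struct rq,
move_queued_task() and finish_task_switch(), and a counter replaces the
real sync_core_before_usermode() call. Locking is left out entirely.

```c
/* Toy stand-in for the per-CPU runqueue; not the kernel's struct rq. */
struct toy_rq {
	int need_sync_core;	/* mirrors the flag this patch adds */
	int nr_sync_core;	/* counts simulated sync_core_before_usermode() calls */
};

/* Migration side: mark the destination runqueue when it receives a task. */
static void toy_move_queued_task(struct toy_rq *dst)
{
	dst->need_sync_core = 1;
}

/* Switch side: serialize once, then clear the flag until the next migration. */
static void toy_finish_task_switch(struct toy_rq *rq)
{
	if (rq->need_sync_core) {
		rq->nr_sync_core++;	/* stands in for sync_core_before_usermode() */
		rq->need_sync_core = 0;
	}
}
```

Several migrations onto the same runqueue before its next task switch
collapse into a single serializing call, and task switches with no
intervening migration add none.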

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CC: Andy Lutomirski <luto@xxxxxxxxxx>
CC: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
CC: Boqun Feng <boqun.feng@xxxxxxxxx>
CC: Andrew Hunter <ahh@xxxxxxxxxx>
CC: Maged Michael <maged.michael@xxxxxxxxx>
CC: Avi Kivity <avi@xxxxxxxxxxxx>
CC: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
CC: Paul Mackerras <paulus@xxxxxxxxx>
CC: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
CC: Dave Watson <davejwatson@xxxxxx>
CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CC: Ingo Molnar <mingo@xxxxxxxxxx>
CC: "H. Peter Anvin" <hpa@xxxxxxxxx>
CC: Andrea Parri <parri.andrea@xxxxxxxxx>
CC: Russell King <linux@xxxxxxxxxxxxxxx>
CC: Greg Hackmann <ghackmann@xxxxxxxxxx>
CC: Will Deacon <will.deacon@xxxxxxx>
CC: David Sehr <sehr@xxxxxxxxxx>
CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
CC: x86@xxxxxxxxxx
CC: linux-arch@xxxxxxxxxxxxxxx
---
kernel/sched/core.c | 7 +++++++
kernel/sched/sched.h | 1 +
2 files changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c79e94278613..4a1c9782267a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -927,6 +927,7 @@ static struct rq *move_queued_task(struct rq *rq, struct rq_flags *rf,

 	rq_lock(rq, rf);
 	BUG_ON(task_cpu(p) != new_cpu);
+	rq->need_sync_core = 1;
 	enqueue_task(rq, p, 0);
 	p->on_rq = TASK_ON_RQ_QUEUED;
 	check_preempt_curr(rq, p, 0);
@@ -2684,6 +2685,12 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	prev_state = prev->state;
 	vtime_task_switch(prev);
 	perf_event_task_sched_in(prev, current);
+#ifdef CONFIG_SMP
+	if (unlikely(rq->need_sync_core)) {
+		sync_core_before_usermode();
+		rq->need_sync_core = 0;
+	}
+#endif
 	finish_lock_switch(rq, prev);
 	finish_arch_post_lock_switch();

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cab256c1720a..33e617bc491c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -734,6 +734,7 @@ struct rq {
 	/* For active balancing */
 	int active_balance;
 	int push_cpu;
+	int need_sync_core;
 	struct cpu_stop_work active_balance_work;
 	/* cpu of this runqueue: */
 	int cpu;
--
2.11.0