On Mon, Sep 28, 2015 at 11:17 AM, Chris Metcalf <cmetcalf@xxxxxxxxxx> wrote:
In prepare_exit_to_usermode(), we would like to callThis is IMO rather nasty. Can you try to find a way to do this
task_isolation_enter() on every return to userspace, and like
other work items, we would like to recheck for more work after
calling it, since it will enable interrupts internally.
However, if task_isolation_enter() is the only work item,
and it has already been called once, we don't want to continue
calling it in a loop. We don't have a dedicated TIF flag for
task isolation, and it wouldn't make sense to have one, since
we'd want to set it before starting exit every time, and then
clear it the first time around the loop.
Instead, we change the loop structure somewhat, so that we
have a more inclusive set of flags that are tested for on the
first entry to the function (including TIF_NOHZ), and if any
of those flags are set, we enter the loop. And, we do the
task_isolation() test unconditionally at the bottom of the loop,
but then when making the decision to loop back, we just use the
set of flags that doesn't include TIF_NOHZ. That way we only
loop if there is other work to do, but then if that work
is done, we again unconditionally call task_isolation_enter().
In syscall_trace_enter_phase1(), we try to add the necessary
support for strict-mode detection of syscalls in an optimized
way, by letting the code remain unchanged if we are not using
TASK_ISOLATION, but otherwise calling enter_from_user_mode()
under the first time we see _TIF_NOHZ, and then waiting until
after we do the secure computing work to actually clear the bit
from the "work" variable and call task_isolation_syscall().
Signed-off-by: Chris Metcalf <cmetcalf@xxxxxxxxxx>
---
arch/x86/entry/common.c | 47 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 36 insertions(+), 11 deletions(-)
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 80dcc9261ca3..0f74389c6f3b 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -21,6 +21,7 @@
#include <linux/context_tracking.h>
#include <linux/user-return-notifier.h>
#include <linux/uprobes.h>
+#include <linux/isolation.h>
#include <asm/desc.h>
#include <asm/traps.h>
@@ -81,7 +82,8 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
*/
if (work & _TIF_NOHZ) {
enter_from_user_mode();
- work &= ~_TIF_NOHZ;
+ if (!IS_ENABLED(CONFIG_TASK_ISOLATION))
+ work &= ~_TIF_NOHZ;
}
#endif
@@ -131,6 +133,13 @@ unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
}
#endif
+ /* Now check task isolation, if needed. */
+ if (IS_ENABLED(CONFIG_TASK_ISOLATION) && (work & _TIF_NOHZ)) {
+ work &= ~_TIF_NOHZ;
+ if (task_isolation_strict())
+ task_isolation_syscall(regs->orig_ax);
+ }
+
without making the control flow depend on config options?
What guarantees that TIF_NOHZ is an acceptable thing to check?
/* Do our best to finish without phase 2. */Too complicated and too error prone.
if (work == 0)
return ret; /* seccomp and/or nohz only (ret == 0 here) */
@@ -217,10 +226,26 @@ static struct thread_info *pt_regs_to_thread_info(struct pt_regs *regs)
/* Called with IRQs disabled. */
__visible void prepare_exit_to_usermode(struct pt_regs *regs)
{
+ u32 cached_flags;
+
if (WARN_ON(!irqs_disabled()))
local_irq_disable();
/*
+ * We may want to enter the loop here unconditionally to make
+ * sure to do some work at least once. Test here for all
+ * possible conditions that might make us enter the loop,
+ * and return immediately if none of them are set.
+ */
+ cached_flags = READ_ONCE(pt_regs_to_thread_info(regs)->flags);
+ if (!(cached_flags & (TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
+ _TIF_UPROBE | _TIF_NEED_RESCHED |
+ _TIF_USER_RETURN_NOTIFY | _TIF_NOHZ))) {
+ user_enter();
+ return;
+ }
+
In any event, I don't think that the property you actually want is for
the loop to be entered once. I think the property you want is that
we're isolated by the time we're finished. Why not just check that
directly in the loop condition?
+ /*Does anything here guarantee forward progress or at least give
* In order to return to user mode, we need to have IRQs off with
* none of _TIF_SIGPENDING, _TIF_NOTIFY_RESUME, _TIF_USER_RETURN_NOTIFY,
* _TIF_UPROBE, or _TIF_NEED_RESCHED set. Several of these flags
@@ -228,15 +253,7 @@ __visible void prepare_exit_to_usermode(struct pt_regs *regs)
* so we need to loop. Disabling preemption wouldn't help: doing the
* work to clear some of the flags can sleep.
*/
- while (true) {
- u32 cached_flags =
- READ_ONCE(pt_regs_to_thread_info(regs)->flags);
-
- if (!(cached_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME |
- _TIF_UPROBE | _TIF_NEED_RESCHED |
- _TIF_USER_RETURN_NOTIFY)))
- break;
-
+ do {
/* We have work to do. */
local_irq_enable();
@@ -258,9 +275,17 @@ __visible void prepare_exit_to_usermode(struct pt_regs *regs)
if (cached_flags & _TIF_USER_RETURN_NOTIFY)
fire_user_return_notifiers();
+ if (task_isolation_enabled())
+ task_isolation_enter();
+
reasonable confidence that we'll make forward progress?