[PATCH v4 00/17] x86: Rewrite exit-to-userspace code

From: Andy Lutomirski
Date: Mon Jun 29 2015 - 15:34:05 EST


This is the first big batch of x86 asm-to-C conversion patches.

The exit-to-usermode code is copied in several places and is written
in a nasty combination of asm and C. It's not at all clear what
it's supposed to do, and the way it's structured makes it very hard
to work with. For example, it's not even clear why syscall exit
hooks are called only once per syscall right now. (It seems to be a
side effect of the way that rdi and rdx are handled in the asm loop,
and it seems reliable, but it's still pointlessly complicated.) The
existing code also makes context tracking overly complicated and
hard to understand. Finally, it's nearly impossible for anyone to
change what happens on exit to usermode, since the existing code is
so fragile.

I tried to clean it up incrementally, but I decided it was too hard.
Instead, this series just replaces the code. It seems to work.

Context tracking in particular works very differently now. The
low-level entry code checks that we're in CONTEXT_USER and switches
to CONTEXT_KERNEL. The exit code does the reverse. There is no
need to track what CONTEXT_XYZ state we came from, because we
already know. Similarly, SCHEDULE_USER is gone, since we can
reschedule if needed by simply calling schedule() from C code.
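
As a rough user-space sketch of the entry/exit pairing described above
(not the code from this series): CONTEXT_USER, CONTEXT_KERNEL and
enter_from_user_mode come from this posting, while cur_state,
need_resched, fake_schedule and exit_to_usermode_sketch are names made
up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU context tracking state. */
enum ctx_state { CONTEXT_KERNEL, CONTEXT_USER };

static enum ctx_state cur_state = CONTEXT_USER;
static bool need_resched;	/* stand-in for a pending-reschedule flag */
static int schedule_calls;	/* counts calls to fake_schedule() */

/* Stand-in for schedule(): it just clears the pending flag. */
static void fake_schedule(void)
{
	schedule_calls++;
	need_resched = false;
}

/* Entry: we can only have come from user mode, so check it and switch. */
static void enter_from_user_mode(void)
{
	assert(cur_state == CONTEXT_USER);
	cur_state = CONTEXT_KERNEL;
}

/* Exit: handle pending work from plain C, then do the reverse switch. */
static void exit_to_usermode_sketch(void)
{
	assert(cur_state == CONTEXT_KERNEL);
	while (need_resched)
		fake_schedule();
	cur_state = CONTEXT_USER;
}
```

Because entry always asserts CONTEXT_USER and exit always asserts
CONTEXT_KERNEL, nothing has to be saved to remember the previous state,
and rescheduling is an ordinary function call in the exit loop.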

The main things that are missing are that I haven't done the 32-bit
parts (anyone want to help?) and therefore I haven't deleted the old
C code. I also think this may break UML for trivial reasons.

IRQ context tracking is still messy. Once the cleanup progresses
to the point that we can enter CONTEXT_KERNEL in syscalls before
enabling interrupts, we can fully clean up IRQ context tracking.

Once these land, I'll send some more :)

Note: we might want to backport patches 1 and 2.

Changes from v3:
- Add the syscall_arg_fault_32 test.
- Fix a pre-existing bad syscall arg buglet.
- Fix an asm glitch due to a bad rebase.
- Fix a CONFIG_PROVE_LOCKDEP warning.
Borislav: the end result of this series differs from the v3.91 that I sent
only in the removal of a single trailing tab. The badarg patch is in
a different place now, though, since we might want to backport it.

Changes from v2: Misplaced the actual list -- sorry.

Changes from v1:
- Fix bisection failure by squashing the 64-bit native and compat syscall
  conversions together. The intermediate state didn't build, and fixing
  it isn't worthwhile (the results will be harder to understand).
- Replace context_tracking_assert_state with CT_WARN_ON and ct_state.
- The last two patches are new. I incorrectly thought that we weren't
  ready for them yet on 32-bit kernels, but I was wrong.
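
For readers who have not seen the ct_state / CT_WARN_ON pair, here is a
user-space mock of the idea, not the kernel definitions. It assumes the
check should stay silent whenever context tracking is disabled;
tracking_enabled, tracked_state, warnings and CONTEXT_DISABLED as used
here are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative states; CONTEXT_DISABLED means "tracking is off". */
enum ctx_state { CONTEXT_DISABLED = -1, CONTEXT_KERNEL = 0, CONTEXT_USER };

static bool tracking_enabled = true;			/* hypothetical switch */
static enum ctx_state tracked_state = CONTEXT_KERNEL;	/* hypothetical state */
static int warnings;					/* warnings emitted so far */

/* Report the tracked state, or CONTEXT_DISABLED when tracking is off. */
static enum ctx_state ct_state(void)
{
	return tracking_enabled ? tracked_state : CONTEXT_DISABLED;
}

/* Warn only if tracking is enabled and the condition holds. */
#define CT_WARN_ON(cond)						\
	do {								\
		if (tracking_enabled && (cond)) {			\
			warnings++;					\
			fprintf(stderr, "CT_WARN_ON(%s) at %s:%d\n",	\
				#cond, __FILE__, __LINE__);		\
		}							\
	} while (0)
```

A caller would write CT_WARN_ON(ct_state() != CONTEXT_KERNEL) at a point
where kernel context is expected; with tracking disabled the check costs
nothing more than the flag test.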

Andy Lutomirski (16):
  selftests/x86: Add a test for 32-bit fast syscall arg faults
  x86/entry/64/compat: Fix bad fast syscall arg failure path
  context_tracking: Add ct_state and CT_WARN_ON
  notifiers: Assert that RCU is watching in notify_die
  x86: Move C entry and exit code to arch/x86/entry/common.c
  x86/traps: Assert that we're in CONTEXT_KERNEL in exception entries
  x86/entry: Add enter_from_user_mode and use it in syscalls
  x86/entry: Add new, comprehensible entry and exit hooks
  x86/entry/64: Really create an error-entry-from-usermode code path
  x86/entry/64: Migrate 64-bit and compat syscalls to new exit hooks
  x86/asm/entry/64: Save all regs on interrupt entry
  x86/asm/entry/64: Simplify irq stack pt_regs handling
  x86/asm/entry/64: Migrate error and interrupt exit work to C
  x86/entry: Remove exception_enter from most trap handlers
  x86/entry: Remove SCHEDULE_USER and asm/context-tracking.h
  x86/irq: Document how IRQ context tracking works and add an assertion

Ingo Molnar (1):
  uml: Fix do_signal() prototype

 arch/um/include/shared/kern_util.h              |   3 +-
 arch/um/kernel/process.c                        |   6 +-
 arch/um/kernel/signal.c                         |   8 +-
 arch/um/kernel/tlb.c                            |   2 +-
 arch/um/kernel/trap.c                           |   2 +-
 arch/x86/entry/Makefile                         |   1 +
 arch/x86/entry/common.c                         | 374 ++++++++++++++++++++++++
 arch/x86/entry/entry_64.S                       | 180 +++---------
 arch/x86/entry/entry_64_compat.S                |  47 ++-
 arch/x86/include/asm/context_tracking.h         |  10 -
 arch/x86/include/asm/signal.h                   |   1 +
 arch/x86/include/asm/traps.h                    |   4 +-
 arch/x86/kernel/cpu/mcheck/mce.c                |   5 +-
 arch/x86/kernel/cpu/mcheck/p5.c                 |   5 +-
 arch/x86/kernel/cpu/mcheck/winchip.c            |   4 +-
 arch/x86/kernel/irq.c                           |  15 +
 arch/x86/kernel/ptrace.c                        | 202 +------------
 arch/x86/kernel/signal.c                        |  28 +-
 arch/x86/kernel/traps.c                         |  87 ++----
 include/linux/context_tracking.h                |  15 +
 include/linux/context_tracking_state.h          |   1 +
 kernel/notifier.c                               |   2 +
 tools/testing/selftests/x86/Makefile            |   2 +-
 tools/testing/selftests/x86/syscall_arg_fault.c | 130 ++++++++
 24 files changed, 672 insertions(+), 462 deletions(-)
 create mode 100644 arch/x86/entry/common.c
 delete mode 100644 arch/x86/include/asm/context_tracking.h
 create mode 100644 tools/testing/selftests/x86/syscall_arg_fault.c

--
2.4.3
