Re: [PATCH v4 09/17] x86/entry: Add new, comprehensible entry and exit hooks

From: Andy Lutomirski
Date: Thu Jul 02 2015 - 12:03:47 EST


On Thu, Jul 2, 2015 at 2:48 AM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Mon, Jun 29, 2015 at 12:33:41PM -0700, Andy Lutomirski wrote:
>> The current entry and exit code is incomprehensible, appears to work
>> primarily by luck, and is very difficult to incrementally improve. Add
>> new code in preparation for simply deleting the old code.
>>
>> prepare_exit_to_usermode is a new function that will handle all slow
>> path exits to user mode. It is called with IRQs disabled and it
>> leaves us in a state in which it is safe to immediately return to
>> user mode. IRQs must not be re-enabled at any point between the time
>> prepare_exit_to_usermode returns and the time user mode is actually entered.
>> (We can, of course, fail to enter user mode and treat that failure
>> as a fresh entry to kernel mode.) All callers of do_notify_resume
>> will be migrated to call prepare_exit_to_usermode instead;
>> prepare_exit_to_usermode needs to do everything that
>> do_notify_resume does, but it also takes care of scheduling and
>> context tracking. Unlike do_notify_resume, it does not need to be
>> called in a loop.
>>
>> syscall_return_slowpath is exactly what it sounds like. It will be
>> called on any syscall exit slow path. It will replace
>> syscall_trace_leave and it calls prepare_exit_to_usermode on the way
>> out.
>>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> ---
>> arch/x86/entry/common.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 111 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
>> index 8a7e35af7164..55530d6dd1bd 100644
>> --- a/arch/x86/entry/common.c
>> +++ b/arch/x86/entry/common.c
>> @@ -207,6 +207,7 @@ long syscall_trace_enter(struct pt_regs *regs)
>> return syscall_trace_enter_phase2(regs, arch, phase1_result);
>> }
>>
>> +/* Deprecated. */
>> void syscall_trace_leave(struct pt_regs *regs)
>
> Ah yes, this will get replaced later with syscall_return_slowpath below.

I already have code in my tree to delete this, but it needs careful
testing for the 32-bit case. The asm changes ended up being more
intrusive than the 64-bit change, and I had to fiddle with vm86 as
well.

>
> Stupid question: what assures us that we'll break out of this loop
> at some point? I.e., isn't the scenario possible of something always
> setting bits in ->flags while we're handling stuff in the IRQs on
> section?

Nothing, actually. We could spin forever if we keep scheduling in and
having TIF_NEED_RESCHED get set before we check again, or we could
even, in principle, keep delivering more and more signals forever.
This seems quite unlikely, though, and if we really end up delivering
infinite signals, something is very wrong already.
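
To make the termination argument concrete, here is a rough userspace
model of the loop shape under discussion. It is only a sketch, not the
code from the patch: the WORK_* bits, struct task_model and
exit_loop_model() are hypothetical names invented for illustration,
and the "new work arrives while we handle old work" behavior is
simulated with plain counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's TIF_* values. */
#define WORK_NEED_RESCHED (1u << 0)
#define WORK_SIGPENDING   (1u << 1)
#define WORK_MASK         (WORK_NEED_RESCHED | WORK_SIGPENDING)

/* Hypothetical stand-in for the per-task state the real loop inspects. */
struct task_model {
	unsigned int flags;   /* pending work bits */
	int pending_signals;  /* signals still queued for delivery */
	int resched_requests; /* times NEED_RESCHED gets set again behind our back */
};

/*
 * Model of the exit loop: keep handling work until no work bits remain.
 * Returns the number of passes taken. Nothing in the loop itself bounds
 * the pass count; it ends only because the simulated sources of new work
 * eventually run dry.
 */
static int exit_loop_model(struct task_model *t)
{
	int passes = 0;

	assert(t != NULL);

	while (t->flags & WORK_MASK) {
		unsigned int work = t->flags;

		passes++;

		/* Models the "IRQs on" section where new work can show up. */
		if (work & WORK_NEED_RESCHED) {
			t->flags &= ~WORK_NEED_RESCHED;
			/* Simulate another wakeup while we were scheduled out. */
			if (t->resched_requests > 0) {
				t->resched_requests--;
				t->flags |= WORK_NEED_RESCHED;
			}
		}

		if (work & WORK_SIGPENDING) {
			if (t->pending_signals > 0)
				t->pending_signals--;
			if (t->pending_signals == 0)
				t->flags &= ~WORK_SIGPENDING;
		}

		/* Models "IRQs off" again: the loop condition re-reads flags. */
	}

	return passes;
}
```

With a finite resched_requests and pending_signals the model always
terminates. If either were replenished on every pass, it would spin
forever, which is the theoretical case described above.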

--Andy