Re: [PATCH 12/31] x86/entry/32: Add PTI cr3 switch to non-NMI entry/exit points

From: Waiman Long
Date: Thu Mar 01 2018 - 09:33:21 EST


On 03/01/2018 08:34 AM, Joerg Roedel wrote:
> On Tue, Feb 27, 2018 at 02:18:36PM -0500, Waiman Long wrote:
>>> + /* Make sure we are running on kernel cr3 */
>>> + SWITCH_TO_KERNEL_CR3 scratch_reg=%eax
>>> +
>>> xorl %edx, %edx # error code 0
>>> movl %esp, %eax # pt_regs pointer
>>>
>> The debug exception calls ret_from_exception on exit. If coming from
>> userspace, the C function prepare_exit_to_usermode() will be called.
>> With the PTI-32 code, it means that function will be called with the
>> entry stack instead of the task stack. This can be problematic as macro
>> like current won't work anymore.
> Okay, I had another look at the debug handler. As I said before, it
> already handles the from-entry-stack case, but with these patches it
> gets more likely that we actually hit that path.
>
> Also, with the special handling for from-kernel-with-entry-stack
> situations we can simplify the debug handler and make it more robust
> with the diff below. Thoughts?
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 8c149f5..844aff1 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1318,33 +1318,14 @@ ENTRY(debug)
> ASM_CLAC
> pushl $-1 # mark this as an int
>
> - SAVE_ALL
> + SAVE_ALL switch_stacks=1
> ENCODE_FRAME_POINTER
>
> - /* Make sure we are running on kernel cr3 */
> - SWITCH_TO_KERNEL_CR3 scratch_reg=%eax
> -
> xorl %edx, %edx # error code 0
> movl %esp, %eax # pt_regs pointer
>
> - /* Are we currently on the SYSENTER stack? */
> - movl PER_CPU_VAR(cpu_entry_area), %ecx
> - addl $CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx
> - subl %eax, %ecx /* ecx = (end of entry_stack) - esp */
> - cmpl $SIZEOF_entry_stack, %ecx
> - jb .Ldebug_from_sysenter_stack
> -
> - TRACE_IRQS_OFF
> - call do_debug
> - jmp ret_from_exception
> -
> -.Ldebug_from_sysenter_stack:
> - /* We're on the SYSENTER stack. Switch off. */
> - movl %esp, %ebx
> - movl PER_CPU_VAR(cpu_current_top_of_stack), %esp
> TRACE_IRQS_OFF
> call do_debug
> - movl %ebx, %esp
> jmp ret_from_exception
> END(debug)
>
>
>
I think that should fix the issue of debug exception from userspace.

One thing that I am not certain about is whether debug exception can
happen even if the IF flag is cleared. If it can, debug exception should
be handled like NMI as the state of the CR3 can be indeterminate if the
exception happens in the entry/exit code.

Cheers,
Longman