Re: [PATCH] x86: Remove force_iret()

From: Andy Lutomirski
Date: Fri Dec 20 2019 - 16:20:55 EST



> On Dec 20, 2019, at 6:59 PM, David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Andy Lutomirski
>> Sent: 20 December 2019 10:30
>>>> On Dec 20, 2019, at 6:10 PM, David Laight <David.Laight@xxxxxxxxxx> wrote:
>>>
>>> From: Brian Gerst
>>>> Sent: 20 December 2019 03:48
>>>>> On Thu, Dec 19, 2019 at 8:50 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>>>
>>>>> On Thu, Dec 19, 2019 at 3:58 AM Brian Gerst <brgerst@xxxxxxxxx> wrote:
>>>>>>
>>>>>> force_iret() was originally intended to prevent the return to user mode with
>>>>>> the SYSRET or SYSEXIT instructions, in cases where the register state could
>>>>>> have been changed to be incompatible with those instructions.
>>>>>
>>>>> It's more than that. Before the big syscall rework, we didn't restore
>>>>> the caller-saved regs. See:
>>>>>
>>>>> commit 21d375b6b34ff511a507de27bf316b3dde6938d9
>>>>> Author: Andy Lutomirski <luto@xxxxxxxxxx>
>>>>> Date: Sun Jan 28 10:38:49 2018 -0800
>>>>>
>>>>> x86/entry/64: Remove the SYSCALL64 fast path
>>>>>
>>>>> So if you changed r12, for example, the change would get lost.
>>>>
>>>> force_iret() specifically dealt with changes to CS, SS and EFLAGS.
>>>> Saving and restoring the extra registers was a different problem
>>>> although it affected the same functions like ptrace, signals, and
>>>> exec.
>>>
>>> Is it ever possible for any of the segment registers to refer to the LDT
>>> and for another thread to invalidate the entries 'very late' ?
>>
>> Not in newer kernels, because the actual LDT is never modified.
>> Instead, LDT changes create a whole new LDT and propagate it with an IPI.
>
> Can the IPI be disabled through the SYSRET path?

There's a whole dance in prepare_exit_to_usermode(). We turn off interrupts, then check for pending work (which does not include this IPI, but includes plenty of other nasty things), and we keep interrupts off until we are in user mode.
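
As a rough illustration of that ordering, here is a minimal userspace model in C. It is not the kernel code: struct cpu_model, handle_work() and model_exit_to_usermode() are hypothetical names, and the pending-work bitmask stands in for whatever the real exit path checks. It only shows the shape of the loop: interrupts go off before the check, and the function returns with them still off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's state; not a kernel structure. */
struct cpu_model {
    bool irqs_enabled;          /* models the interrupt flag */
    unsigned int pending_work;  /* bitmask of hypothetical work items */
    unsigned int handled;       /* work items already processed */
};

/* Work is handled with interrupts on, so new work could arrive meanwhile. */
static void handle_work(struct cpu_model *cpu)
{
    cpu->irqs_enabled = true;
    cpu->handled |= cpu->pending_work;
    cpu->pending_work = 0;
}

/* Model of the exit path: disable interrupts, then check, then repeat. */
static void model_exit_to_usermode(struct cpu_model *cpu)
{
    for (;;) {
        cpu->irqs_enabled = false;  /* interrupts off before the check */
        if (cpu->pending_work == 0)
            break;                  /* nothing pending: leave them off */
        handle_work(cpu);           /* re-check after handling */
    }
    /* Interrupts stay off here; in this model only the return to user
     * mode would turn them back on. */
    assert(!cpu->irqs_enabled);
}
```

Because the final check happens with interrupts off, nothing can become pending between that check and the return to user mode; an interrupt that arrives in that window is simply delivered once user mode is reached.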

> Once in user space, the IPI will interrupt the process and, I presume, it will
> pick up the new LDT on 'return to user'.

The new LDT is picked up in the IPI callback.

> But if the IPI happens between the LDT being set and SYSRET it will (presumably)
> remain 'pending' until the next system call?
> Which could be long enough for one thread to have passed a pointer across giving
> an unexpected SEGV (or maybe worse, failing to give an expected one).

modify_ldt() won't return until all threads have the new LDT.
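
To make the copy-and-replace scheme concrete, here is a small single-threaded model in C. It is a sketch under stated assumptions, not the kernel implementation: struct ldt_model, struct mm_model, model_ipi_callback() and model_write_ldt() are hypothetical names, the per-CPU array stands in for "which LDT each CPU has loaded", and a plain loop stands in for a synchronous IPI. The points it illustrates are the ones above: the live table is never edited in place, the new table is picked up in the callback, and the writer does not return until every CPU has switched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_NCPUS       4   /* hypothetical CPU count */
#define MODEL_LDT_ENTRIES 8   /* hypothetical table size */

struct ldt_model {
    unsigned long entries[MODEL_LDT_ENTRIES];
};

struct mm_model {
    struct ldt_model *ldt;                         /* currently published table */
    const struct ldt_model *cpu_ldt[MODEL_NCPUS];  /* table each CPU has loaded */
};

/* Stand-in for the IPI callback: the target CPU loads the published table. */
static void model_ipi_callback(struct mm_model *mm, int cpu)
{
    mm->cpu_ldt[cpu] = mm->ldt;
}

/* Returns 0 on success, -1 on a bad index or allocation failure. */
static int model_write_ldt(struct mm_model *mm, int idx, unsigned long val)
{
    struct ldt_model *old = mm->ldt;
    struct ldt_model *new_ldt;
    int cpu;

    if (idx < 0 || idx >= MODEL_LDT_ENTRIES)
        return -1;

    new_ldt = malloc(sizeof(*new_ldt));
    if (!new_ldt)
        return -1;

    /* Build a whole new table; the old one is never modified. */
    if (old)
        memcpy(new_ldt, old, sizeof(*new_ldt));
    else
        memset(new_ldt, 0, sizeof(*new_ldt));
    new_ldt->entries[idx] = val;

    /* Publish, then wait for every CPU to pick it up. */
    mm->ldt = new_ldt;
    for (cpu = 0; cpu < MODEL_NCPUS; cpu++)
        model_ipi_callback(mm, cpu);

    /* Only now is the old table unreferenced and safe to free. */
    free(old);
    return 0;
}
```

In this model, by the time model_write_ldt() returns, no CPU still points at the old table, so a pointer handed to another thread after the call cannot be interpreted through a stale LDT.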