Re: [patch V3 01/13] entry: Provide generic syscall entry functionality

From: Thomas Gleixner
Date: Mon Jul 20 2020 - 02:50:08 EST


Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>> On Jul 19, 2020, at 3:17 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>
>> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>>>> On Sat, Jul 18, 2020 at 7:16 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>>> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>>>>> FWIW, TIF_USER_RETURN_NOTIFY is a bit of an odd duck: it's an
>>>>> entry/exit word *and* a context switch word. The latter is because
>>>>> it's logically a per-cpu flag, not a per-task flag, and the context
>>>>> switch code moves it around so it's always set on the running task.
>>>>
>>>> Gah, I missed the context switch thing of that. That stuff is hideous.
>>>
>>> It's also delightful because anything that screws up that dance (such
>>> as failure to do the exit-to-usermode path exactly right) likely
>>> results in an insta-root-hole. If we fail to run user return
>>> notifiers, we can run user code with incorrect syscall MSRs, etc.
>>
>> Looking at it deeper, having that thing in the loop is a pointless
>> exercise. This really wants to be done _after_ the loop.
>>
> As long as we're confident that nothing after the loop can set the flag again.

Yes, because that's the direct way off to user space.
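
For illustration only, a rough user-space sketch of the ordering discussed
above; it is not the kernel code from this series. All names here
(WORK_*, struct task_sim, exit_to_user_mode_sim) are hypothetical. It
assumes the loop only handles work that can be raised again while the
loop runs, and that the user return notifiers run exactly once after the
loop, on the straight path out to user space.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; not the kernel's real TIF_* values. */
#define WORK_NEED_RESCHED       (1u << 0)
#define WORK_SIGPENDING         (1u << 1)
#define WORK_USER_RETURN_NOTIFY (1u << 2)

/* Work that may be raised again while it is being handled, so it is looped on. */
#define LOOP_WORK (WORK_NEED_RESCHED | WORK_SIGPENDING)

/* Minimal stand-in for per-task state, plus counters to observe the ordering. */
struct task_sim {
	unsigned int flags;
	int loop_passes;
	int notifier_runs;
};

/* Handle the loop-style work until none of it is pending anymore. */
static void exit_to_user_mode_loop_sim(struct task_sim *t)
{
	while (t->flags & LOOP_WORK) {
		t->loop_passes++;

		if (t->flags & WORK_NEED_RESCHED)
			t->flags &= ~WORK_NEED_RESCHED;	/* stands in for schedule() */

		if (t->flags & WORK_SIGPENDING)
			t->flags &= ~WORK_SIGPENDING;	/* stands in for signal delivery */
	}
}

/* Run the notifier step once, after the loop, right before "returning". */
static void exit_to_user_mode_sim(struct task_sim *t)
{
	exit_to_user_mode_loop_sim(t);

	if (t->flags & WORK_USER_RETURN_NOTIFY) {
		t->flags &= ~WORK_USER_RETURN_NOTIFY;
		t->notifier_runs++;	/* stands in for firing the notifiers */
	}

	/* Nothing past this point may set the flag again in this sketch. */
	assert(!(t->flags & WORK_USER_RETURN_NOTIFY));
}
```

The sketch only demonstrates the ordering. Whether anything after the loop
can set the flag again in the real code is the question raised above, and
a model like this does not answer it.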

Thanks,

tglx