Re: context tracking vs. syscall_trace_leave & do_notify_resume loop

From: Rik van Riel
Date: Fri May 01 2015 - 12:15:18 EST


On 05/01/2015 12:05 PM, Andy Lutomirski wrote:
> On Fri, May 1, 2015 at 9:00 AM, Rik van Riel <riel@xxxxxxxxxx> wrote:

>> I suspect we probably only need two possible function
>> calls at syscall exit time:
>>
>> 1) A function that is called with interrupts still
>> enabled, testing flags that could be set again
>> if something happens (eg. preemption) between
>> when the function is called, and we return to
>> user space.
>>
>> 2) A function that is called after the point of
>> no return, with interrupts disabled, which
>> does (mostly) small things that only happen
>> once.
>
> I think we only need one function. It would be (asm pseudocode):
>
>     disable irqs;
>     if (slow) {
>         save extra regs;
>         call function;
>         restore extra regs;
>     }
>
>     return via opportunistic sysret path.
>
> I can't see any legitimate reason for the current mess, except that
> it's so complicated and so poorly documented that everyone's afraid of
> fixing it.

do_notify_resume() can call do_signal(), which can sleep, after
which all bets are off on what new flags may have been set.

On the other hand, we have stuff that can run just fine with
irqs disabled that we really want to call only once.

For that reason, I suspect we need two functions.
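
A minimal user-space sketch of that split, in case it helps make the
shape concrete. Everything here is hypothetical: the flag names, the
task_sim struct, and the three functions are illustration only and are
not kernel APIs. The "rearm" counter merely simulates a sleeping signal
handler causing a new flag to be set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical work flags, loosely modelled on TIF_* bits. */
enum {
    WORK_SIGPENDING   = 1u << 0, /* handling may sleep */
    WORK_NEED_RESCHED = 1u << 1, /* can be set again at any time */
    WORK_ONCE         = 1u << 2, /* cheap, irq-safe, must run exactly once */
};

#define LOOP_FLAGS (WORK_SIGPENDING | WORK_NEED_RESCHED)

/* Simulated task state; not a real kernel structure. */
struct task_sim {
    unsigned int flags;
    bool irqs_enabled;
    int rearm;           /* times signal handling re-sets NEED_RESCHED */
    int loop_iterations; /* passes through the irqs-on function */
    int once_calls;      /* calls of the one-time work */
};

/* Function 1: runs with irqs enabled; flags may be set again behind it. */
static void exit_work_irqs_on(struct task_sim *t)
{
    t->irqs_enabled = true;
    if (t->flags & WORK_NEED_RESCHED)
        t->flags &= ~WORK_NEED_RESCHED;       /* stands in for schedule() */
    if (t->flags & WORK_SIGPENDING) {
        t->flags &= ~WORK_SIGPENDING;         /* stands in for do_signal() */
        if (t->rearm > 0) {                   /* we "slept": new work appeared */
            t->rearm--;
            t->flags |= WORK_NEED_RESCHED;
        }
    }
}

/* Function 2: runs once, after the point of no return, irqs disabled. */
static void exit_work_irqs_off(struct task_sim *t)
{
    assert(!t->irqs_enabled);
    if (t->flags & WORK_ONCE) {
        t->flags &= ~WORK_ONCE;
        t->once_calls++;
    }
}

/* Loop on function 1 until no loop flags remain, then call function 2. */
static void exit_to_user_sim(struct task_sim *t)
{
    for (;;) {
        t->irqs_enabled = false;              /* recheck with irqs disabled */
        if (!(t->flags & LOOP_FLAGS))
            break;
        exit_work_irqs_on(t);
        t->loop_iterations++;
    }
    exit_work_irqs_off(t);
}
```

Starting from flags = WORK_SIGPENDING | WORK_ONCE with rearm = 1, the
irqs-on function has to run twice, because handling the signal re-sets
WORK_NEED_RESCHED, while the irqs-off function still runs only once.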

--
All rights reversed