Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled

From: Andy Lutomirski
Date: Wed Mar 25 2015 - 11:08:22 EST


On Wed, Mar 25, 2015 at 5:21 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>> On Tue, Mar 24, 2015 at 1:08 PM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>> >
>> > * Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>> >
>> >> We currently have a race: if we're preempted during syscall exit, we
>> >> can fail to process syscall return work that is queued up while
>> >> we're preempted in ret_from_sys_call after checking ti.flags.
>> >>
>> >> Fix it by disabling interrupts before checking ti.flags.
>> >>
>> >> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
>> >> Reported-by: Stefan Seyfried <stefan.seyfried@xxxxxxxxxxxxxx>
>> >> Reported-by: Takashi Iwai <tiwai@xxxxxxx>
>> >> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> >> ---
>> >>
>> >> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff. Can you take a quick
>> >> look to confirm that it's okay to call it more than once?
>> >
>> > So the essence is that it wants to print this warning if we are
>> > holding a lock after a syscall:
>> >
>> > printk("[ BUG: lock held when returning to user space! ]\n");
>> >
>> > it manipulates no state and is not sensitive to whether it's called
>> > before or after return-work processing.
>> >
>> >> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>> >> 1 file changed, 14 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> >> index 1d74d161687c..2babb393915e 100644
>> >> --- a/arch/x86/kernel/entry_64.S
>> >> +++ b/arch/x86/kernel/entry_64.S
>> >> @@ -364,12 +364,21 @@ system_call_fastpath:
>> >> * Has incomplete stack frame and undefined top of stack.
>> >> */
>> >> ret_from_sys_call:
>> >> - testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> >> - jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> >> -
>> >> LOCKDEP_SYS_EXIT
>> >> DISABLE_INTERRUPTS(CLBR_NONE)
>> >> TRACE_IRQS_OFF
>> >> +
>> >> + /*
>> >> + * We must check ti flags with interrupts (or at least preemption)
>> >> + * off because we must *never* return to userspace without
>> >> + * processing exit work that is enqueued if we're preempted here.
>> >> + * In particular, returning to userspace with any of the one-shot
>> >> + * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
>> >> + * very bad.
>> >> + */
>> >> + testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> >> + jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> >
>> > Should be safe to call it once again after user-work processing has
>> > been finished.
>> >
>> > I've picked up your fix for tip:x86/urgent.
>>
>> FWIW, the tentative merge here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=tmp.tmp&id=a77dd1607ad88a601259a74ba4d646fa68b7cd9a
>>
>> looks funny. Why aren't you jumping to int_ret_from_sys_call_irqs_off?
>
> Indeed - the orphaned label should have told me that. The mismerge is
> functionally harmless (causes extra overhead in the slowpath), that's
> why it passed testing.
>
> Does:
>
> 06ab9c1ba6a1 Merge branch 'x86/urgent' into x86/asm, to resolve conflict
>
> look better to you?

Yes, looks good. Thanks.
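
For anyone reading this thread later, the ordering problem the patch
fixes can be modelled in plain userspace C. This is only a sketch, not
kernel code: struct task, WORK_PENDING, maybe_preempt() and the two
exit_*() helpers are hypothetical names invented for illustration, and
"preemption" is simulated by a function call that sets a flag whenever
interrupts are still enabled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for _TIF_ALLWORK_MASK. */
#define WORK_PENDING 0x1u

/* Hypothetical stand-in for the bits of thread_info that matter here. */
struct task {
    unsigned flags;     /* models ti.flags */
    bool irqs_enabled;  /* models the CPU interrupt flag */
};

/*
 * Models an interrupt or preemption that queues exit work.
 * It can only happen while interrupts are enabled.
 */
static void maybe_preempt(struct task *t)
{
    if (t->irqs_enabled)
        t->flags |= WORK_PENDING;
}

/*
 * Old ordering: test the flags first, disable interrupts afterwards.
 * Returns the work bits still pending when "returning to userspace".
 */
static unsigned exit_check_then_disable(struct task *t)
{
    bool work = (t->flags & WORK_PENDING) != 0;

    maybe_preempt(t);           /* race window: work queued after the test */
    t->irqs_enabled = false;

    if (work)
        t->flags &= ~WORK_PENDING;  /* slow path processes the work */

    return t->flags & WORK_PENDING;
}

/*
 * New ordering: disable interrupts first, then test the flags.
 * Work queued before the disable is seen by the test; nothing can
 * be queued after it.
 */
static unsigned exit_disable_then_check(struct task *t)
{
    maybe_preempt(t);           /* preemption before the disable is harmless */
    t->irqs_enabled = false;
    maybe_preempt(t);           /* no-op: interrupts are off */

    if (t->flags & WORK_PENDING)
        t->flags &= ~WORK_PENDING;  /* slow path processes the work */

    return t->flags & WORK_PENDING;
}
```

Starting from a task with no flags set and interrupts enabled, the
first helper returns with WORK_PENDING still set (the lost exit work),
while the second returns 0.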