Re: [tip:x86/asm] x86/entry/32: Switch INT80 to the new C syscall path

From: Andy Lutomirski
Date: Fri Oct 16 2015 - 11:59:53 EST


On Fri, Oct 16, 2015 at 3:52 AM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Thu, Oct 15, 2015 at 12:09:16PM -0700, Andy Lutomirski wrote:
>> On Thu, Oct 15, 2015 at 11:09 AM, Borislav Petkov <bp@xxxxxxxxx> wrote:
>> > On Fri, Oct 09, 2015 at 06:12:44AM -0700, tip-bot for Andy Lutomirski wrote:
>> >> Commit-ID: 150ac78d63afb96360dab448b7b4d33c98c8266c
>> >> Gitweb: http://git.kernel.org/tip/150ac78d63afb96360dab448b7b4d33c98c8266c
>> >> Author: Andy Lutomirski <luto@xxxxxxxxxx>
>> >> AuthorDate: Mon, 5 Oct 2015 17:48:14 -0700
>> >> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
>> >> CommitDate: Fri, 9 Oct 2015 09:41:10 +0200
>> >>
>> >> x86/entry/32: Switch INT80 to the new C syscall path
>> >>
>> >> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> >> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>> >> Cc: Borislav Petkov <bp@xxxxxxxxx>
>> >> Cc: Brian Gerst <brgerst@xxxxxxxxx>
>> >> Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
>> >> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
>> >> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> >> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> >> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> >> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> >> Link: http://lkml.kernel.org/r/a7e8d8df96838eae3208dd0441023f3ce7a81831.1444091585.git.luto@xxxxxxxxxx
>> >> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>> >> ---
>> >> arch/x86/entry/entry_32.S | 34 +++++++++++++---------------------
>> >> 1 file changed, 13 insertions(+), 21 deletions(-)
>> >
>> > Just triggered this here on rc5+tip/master, 32-bit. Any ideas?
>> >
>> > ------------[ cut here ]------------
>> > WARNING: CPU: 1 PID: 1 at /mnt/kernel/kernel/linux-2.6/kernel/locking/lockdep.c:2639 trace_hardirqs_off_caller+0xef/0x150()
>> > DEBUG_LOCKS_WARN_ON(!irqs_disabled())
>> > Modules linked in:
>> >
>> > CPU: 1 PID: 1 Comm: init Tainted: G W 4.3.0-rc5+ #1
>> > Hardware name: LENOVO 30515QG/30515QG, BIOS 8RET30WW (1.12 ) 09/15/2011
>> > 00000000 00000000 f44fbf34 c1301072 f44fbf74 f44fbf64 c105658d c1819094
>> > f44fbf90 00000001 c181f838 00000a4f c10a284f c10a284f f4520000 c1662048
>> > 00000009 f44fbf7c c10565f3 00000009 f44fbf74 c1819094 f44fbf90 f44fbf9c
>> > Call Trace:
>> > [<c1301072>] dump_stack+0x4b/0x79
>> > [<c105658d>] warn_slowpath_common+0x8d/0xc0
>> > [<c10a284f>] ? trace_hardirqs_off_caller+0xef/0x150
>> > [<c10a284f>] ? trace_hardirqs_off_caller+0xef/0x150
>> > [<c1662048>] ? entry_INT80_32+0x28/0x2f
>>
>> Can you turn that entry_INT80_32 address into either a line number or
>> some assembly code? I'm not seeing the code path that could do this,
>> and there are two unlikely choices.
>
> Why, that's the TRACE_IRQS_OFF at the end of entry_INT80_32. It calls
> trace_hardirqs_off_caller through the thunk. That's pretty obvious. Or
> am I misunderstanding you?

I was thinking it could also be TRACE_IRQS_IRETQ, but I was wrong
(that would be trace_hardirqs_on_caller).
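
For anyone following along, here is a rough sketch of how the
entry_INT80_32+0x28 in the trace maps onto the disassembly quoted below:
it is the return address of the 5-byte "call rel32" at c1662043. The
helper names are made up for illustration; the addresses and
displacement bytes are taken from the objdump output.

```c
#include <stdint.h>

/* Hypothetical helpers, not kernel code: decode an x86 "call rel32" (opcode e8). */

/* The instruction is 5 bytes long: one opcode byte plus a 32-bit displacement. */
#define CALL_REL32_LEN 5u

/* Address pushed on the stack by the call, i.e. what shows up in a backtrace. */
static uint32_t call_return_addr(uint32_t insn_addr)
{
	return insn_addr + CALL_REL32_LEN;
}

/*
 * Call target: the displacement is relative to the next instruction.
 * rel32 is the little-endian value of the four bytes after e8;
 * unsigned arithmetic wraps modulo 2^32, which handles negative displacements.
 */
static uint32_t call_rel32_target(uint32_t insn_addr, uint32_t rel32)
{
	return call_return_addr(insn_addr) + rel32;
}
```

With insn_addr = 0xc1662043 and rel32 = 0xff99eec8 (bytes c8 ee 99 ff),
the return address is 0x28 past the start of entry_INT80_32 at
0xc1662020, and the target works out to the trace_hardirqs_off_thunk
address that objdump prints.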

>
> c1662020 <entry_INT80_32>:
> c1662020: 90 nop
> c1662021: 90 nop
> c1662022: 90 nop
> c1662023: 50 push %eax
> c1662024: fc cld
> c1662025: 6a 00 push $0x0
> c1662027: 0f a0 push %fs
> c1662029: 06 push %es
> c166202a: 1e push %ds
> c166202b: 6a da push $0xffffffda
> c166202d: 55 push %ebp
> c166202e: 57 push %edi
> c166202f: 56 push %esi
> c1662030: 52 push %edx
> c1662031: 51 push %ecx
> c1662032: 53 push %ebx
> c1662033: ba 7b 00 00 00 mov $0x7b,%edx
> c1662038: 8e da mov %edx,%ds
> c166203a: 8e c2 mov %edx,%es
> c166203c: ba d8 00 00 00 mov $0xd8,%edx
> c1662041: 8e e2 mov %edx,%fs
> c1662043: e8 c8 ee 99 ff call c1000f10 <trace_hardirqs_off_thunk>
> c1662048: 89 e0 mov %esp,%eax
> c166204a: e8 61 f9 99 ff call c10019b0 <do_int80_syscall_32>
>
>
> /*
> * User mode is traced as though IRQs are on, and the interrupt gate
> * turned them off.
> */
> TRACE_IRQS_OFF
>
> Sounds like the gate didn't disable IRQs, right? Or did the
> irqs_disabled() check get tricked into looking at the wrong flags...?
> But I don't see it. Hmmm..

Wow I am incompetent.

set_system_trap_gate(IA32_SYSCALL_VECTOR, entry_INT80_32);

How did I not catch that in testing? Can you change that to
set_system_intr_gate and see if that helps?
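
To spell out why the gate type matters: on x86, delivery through an
interrupt gate clears EFLAGS.IF, while delivery through a trap gate
leaves it alone. Below is a toy user-space model of that difference.
The function names are invented for this sketch and it is not the
kernel's implementation; only the IF bit position and the 32-bit gate
type encodings are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt-enable flag: bit 9 of EFLAGS. */
#define X86_EFLAGS_IF (1u << 9)

/* 32-bit IDT gate type encodings. */
enum gate_type {
	GATE_INTERRUPT = 0xE,	/* CPU clears IF on entry */
	GATE_TRAP      = 0xF,	/* CPU leaves IF unchanged on entry */
};

/* Model (assumption: simplified) of EFLAGS as seen by the first handler instruction. */
static uint32_t eflags_after_gate(uint32_t eflags, enum gate_type type)
{
	if (type == GATE_INTERRUPT)
		eflags &= ~X86_EFLAGS_IF;
	return eflags;
}

/* Stand-in for the irqs_disabled() check that lockdep performs. */
static bool irqs_disabled_model(uint32_t eflags)
{
	return !(eflags & X86_EFLAGS_IF);
}

/*
 * Would DEBUG_LOCKS_WARN_ON(!irqs_disabled()) fire when the entry code
 * runs TRACE_IRQS_OFF right after entering from user mode?
 */
static bool lockdep_warning_would_fire(enum gate_type type)
{
	uint32_t user_eflags = X86_EFLAGS_IF;	/* user mode runs with IRQs on */
	uint32_t entry_eflags = eflags_after_gate(user_eflags, type);

	return !irqs_disabled_model(entry_eflags);
}
```

In this model lockdep_warning_would_fire(GATE_TRAP) is true and
lockdep_warning_would_fire(GATE_INTERRUPT) is false, which matches the
warning above: the entry code now assumes the gate turned IRQs off, but
a trap gate never does that.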

--Andy