Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

From: peterz
Date: Wed Aug 12 2020 - 04:22:03 EST


On Wed, Aug 12, 2020 at 10:06:50AM +0200, Marco Elver wrote:
> On Tue, Aug 11, 2020 at 10:17PM +0200, peterz@xxxxxxxxxxxxx wrote:
> > On Tue, Aug 11, 2020 at 11:46:51AM +0200, peterz@xxxxxxxxxxxxx wrote:
> >
> > > So let me once again see if I can't find a better solution for this all.
> > > Clearly it needs one :/
> >
> > So the below boots without triggering the debug code from Marco -- it
> > should allow nesting local_irq_save/restore under raw_local_irq_*().
> >
> > I tried unconditional counting, but there's some _really_ wonky /
> > asymmetric code that wrecks that and I've not been able to come up with
> > anything useful.
> >
> > This one starts counting when local_irq_save() finds it didn't disable
> > IRQs while lockdep thought it did. At that point, local_irq_restore()
> > will decrement and enable things again when it reaches 0.
> >
> > This assumes local_irq_save()/local_irq_restore() are nested sanely, which
> > is mostly true.
> >
> > This leaves #PF, which I fixed in these other patches, but I realized it
> > needs fixing for all architectures :-( No bright ideas there yet.
> >
> > ---
> > arch/x86/entry/thunk_32.S | 5 ----
> > include/linux/irqflags.h | 45 +++++++++++++++++++-------------
> > init/main.c | 16 ++++++++++++
> > kernel/locking/lockdep.c | 58 +++++++++++++++++++++++++++++++++++++++++
> > kernel/trace/trace_preemptirq.c | 33 +++++++++++++++++++++++
> > 5 files changed, 134 insertions(+), 23 deletions(-)
>
> Testing this again with syzkaller produced some new reports:
>
> BUG: stack guard page was hit in error_entry
> BUG: stack guard page was hit in exc_int3
> PANIC: double fault in error_entry
> PANIC: double fault in exc_int3
>
> Most of them have corrupted reports, but this one might be useful:
>
> BUG: stack guard page was hit at 000000001fab0982 (stack is 00000000063f33dc..00000000bf04b0d8)
> BUG: stack guard page was hit at 00000000ca97ac69 (stack is 00000000af3e6c84..000000001597e1bf)
> kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP
> CPU: 1 PID: 4709 Comm: kworker/1:1H Not tainted 5.8.0+ #5
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> Workqueue: events_highpri snd_vmidi_output_work
> RIP: 0010:exc_int3+0x5/0xf0 arch/x86/kernel/traps.c:636
> Code: c9 85 4d 89 e8 31 c0 e8 a9 7d 68 fd e9 90 fe ff ff e8 0f 35 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 53 48 89 fb <e8> 76 0e 00 00 85 c0 74 03 5b 5d c3 f6 83 88 00 00 00 03 74 7e 48
> RSP: 0018:ffffc90008114000 EFLAGS: 00010083
> RAX: 0000000084e00e17 RBX: ffffc90008114018 RCX: ffffffff84e00e17
> RDX: 0000000000000000 RSI: ffffffff84e00a39 RDI: ffffc90008114018
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88807dc80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffc90008113ff8 CR3: 000000002dae4006 CR4: 0000000000770ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 00000000
> Call Trace:
> asm_exc_int3+0x31/0x40 arch/x86/include/asm/idtentry.h:537
> RIP: 0010:arch_static_branch include/trace/events/preemptirq.h:40 [inline]
> RIP: 0010:static_key_false include/linux/jump_label.h:200 [inline]
> RIP: 0010:trace_irq_enable_rcuidle+0xd/0x120 include/trace/events/preemptirq.h:40
> Code: 24 08 48 89 df e8 43 8d ef ff 48 89 df 5b e9 4a 2e 99 03 66 2e 0f 1f 84 00 00 00 00 00 55 41 56 53 48 89 fb e8 84 1a fd ff cc <1f> 44 00 00 5b 41 5e 5d c3 65 8b 05 ab 74 c3 7e 89 c0 31 f6 48 0f
> RSP: 0018:ffffc900081140f8 EFLAGS: 00000093
> RAX: ffffffff813d9e8c RBX: ffffffff81314dd3 RCX: ffff888076ce6000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81314dd3
> RBP: 0000000000000000 R08: ffffffff813da3d4 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000082 R14: 0000000000000000 R15: ffff888076ce6000
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
>
> <... repeated many many times ...>
>
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> Lost 500 message(s)!
> BUG: stack guard page was hit at 00000000cab483ba (stack is 00000000b1442365..00000000c26f9ad3)
> BUG: stack guard page was hit at 00000000318ff8d8 (stack is 00000000fd87d656..0000000058100136)
> ---[ end trace 4157e0bb4a65941a ]---

Wheee... recursion! Let me try and see if I can make something of that.
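
For reference, the cycle in the trace above is trace_hardirqs_restore() ->
trace_irq_enable_rcuidle() -> rcu_irq_enter_irqson() -> trace_hardirqs_restore()
again, until the stack runs into the guard page. Below is a rough userspace
model of one generic way to break such a cycle: a re-entrancy flag around the
tracing hook. This is only a sketch and not necessarily the eventual fix. All
model_*() names, the in_trace_hook flag and the counters are made up for
illustration, and a plain global stands in for what would have to be per-CPU
state in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state; in the kernel this would need to be per-CPU. */
bool in_trace_hook;          /* set while the tracing hook is running */
unsigned int hook_calls;     /* entries that got past the guard */
unsigned int blocked_calls;  /* nested entries the guard turned away */

void model_trace_hardirqs_restore(void);

/* Stand-in for rcu_irq_enter_irqson(): in the trace it ends up back in
 * trace_hardirqs_restore(). */
void model_rcu_irq_enter_irqson(void)
{
	model_trace_hardirqs_restore();
}

/* Stand-in for trace_irq_enable_rcuidle(): the _rcuidle tracepoint
 * variant enters RCU before running the probes. */
void model_trace_irq_enable_rcuidle(void)
{
	model_rcu_irq_enter_irqson();
}

/* Stand-in for trace_hardirqs_restore(), with a re-entrancy guard. */
void model_trace_hardirqs_restore(void)
{
	if (in_trace_hook) {
		/* Already inside the hook: bail out instead of recursing. */
		blocked_calls++;
		return;
	}

	in_trace_hook = true;
	hook_calls++;
	model_trace_irq_enable_rcuidle();
	in_trace_hook = false;
}

/* Reset the model, trigger the hook once, report guarded entries. */
unsigned int run_model(void)
{
	in_trace_hook = false;
	hook_calls = 0;
	blocked_calls = 0;

	model_trace_hardirqs_restore();

	assert(!in_trace_hook);	/* the guard must be dropped on the way out */
	return hook_calls;
}
```

Without the in_trace_hook check, model_trace_hardirqs_restore() would recurse
without bound, which is the userspace analogue of the double fault above. With
it, the nested call returns immediately and the outer call completes. The cost
is that any event raised from inside the hook is silently dropped.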