Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()
From: Peter Zijlstra
Date: Fri Jun 05 2020 - 06:39:28 EST
On Thu, Jun 04, 2020 at 03:54:45PM -0700, Paul E. McKenney wrote:
> Hello!
>
> I get the splat below at a rate of roughly two per thirty hours when
> running rcutorture scenario TREE03 on x86 at the June 3rd mainline commit:
>
> cb8e59cc8720 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
>
> Running 140 hours of this same scenario at the following June 2nd mainline
> commit shows no errors:
>
> d9afbb350990 ("Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security")
>
> I have started a bisection, but it is likely to take several days to
> complete. I am looking at ways of speeding this up, but in the meantime,
> I figured that I should check to see if others are also encountering this.
>
> Thoughts?
I think this shows there's a boo-boo with the IPI patches. I've not
managed to reproduce, but I'll give them another hard look.
Would you have a .config for me? My compiler's check_preempt_wakeup
isn't anywhere near 0x180 bytes long. I'm thinking you have
instrumentation enabled, KCSAN?
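
For reference, the size comparison comes from the "symbol+offset/size" form on the RIP line of the splat: the number after the slash is the length of the function in the reporter's build. A minimal sketch of pulling those fields apart, assuming the usual x86 oops formatting; the helper name parse_rip and the regular expression are illustrative only and not part of any kernel tooling:

```python
import re

# Hypothetical helper (an assumption, not kernel tooling): splits the
# "symbol+0xOFFSET/0xSIZE" form printed on the RIP line of an x86 oops.
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-f]{4}:(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

def parse_rip(line):
    """Return (symbol, offset, size) parsed from an oops RIP line."""
    m = RIP_RE.search(line)
    if m is None:
        raise ValueError("no symbol+offset/size found on this line")
    return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)

# The RIP line from the splat quoted in this thread.
sym, off, size = parse_rip("RIP: 0010:check_preempt_wakeup+0xb1/0x180")
print(f"{sym}: fault at byte {off} of {size}")
# prints "check_preempt_wakeup: fault at byte 177 of 384"
```

The size can then be checked against a local build, for instance with the symbol sizes that `nm -S vmlinux` reports. A large mismatch suggests a different compiler or .config, such as extra instrumentation, rather than a different source tree.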
> BUG: kernel NULL pointer dereference, address: 0000000000000150
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 9 PID: 196 Comm: rcu_torture_rea Not tainted 5.7.0+ #3923
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 04/01/2014
> RIP: 0010:check_preempt_wakeup+0xb1/0x180
> Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48 85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 48 39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94
> RSP: 0018:ffffaccdc02ecd38 EFLAGS: 00010006
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffafa0bc20
> RDX: 0000000000000000 RSI: ffff946b5df50000 RDI: ffff946b5f469340
> RBP: 0000000000000000 R08: ffff946b5df80d00 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff946b5f469300
> R13: 0000000000000001 R14: ffff946b5df80d00 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff946b5f440000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000150 CR3: 0000000016e0a000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> check_preempt_curr+0x5d/0x90
> ttwu_do_wakeup.isra.93+0xf/0x100
> sched_ttwu_pending+0x66/0x90
> smp_call_function_single_interrupt+0x2d/0xf0
> call_function_single_interrupt+0xf/0x20
Right, so I frobbed at that recently, see:
a148866489fbe243c936fe43e4525d8dbfa0318f...19a1f5ec699954d21be10f74ff71c2a7079e99ad