Re: BUG_ON in rcu_sync_func triggered
From: Oleg Nesterov
Date: Mon Sep 12 2016 - 09:02:18 EST
Hi Nikolay,
On 09/12, Nikolay Borisov wrote:
>
> [ 2213.610208] ------------[ cut here ]------------
> [ 2213.614243] kernel BUG at kernel/rcu/sync.c:152!
> [ 2213.618270] invalid opcode: 0000 [#1] SMP
> [ 2213.696629] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.4.2-clouder2 #1
> [ 2213.702891] Hardware name: Supermicro X9DRW/X9DRW, BIOS 1.0b 10/11/2012
> [ 2213.709155] task: ffff880276a24e00 ti: ffff880276a40000 task.ti: ffff880276a40000
> [ 2213.716391] RIP: 0010:[<ffffffff810a4af0>] [<ffffffff810a4af0>] rcu_sync_func+0xa0/0xb0
> [ 2213.724407] RSP: 0018:ffff88047fc83da8 EFLAGS: 00010297
> [ 2213.729211] RAX: ffffffff810a4a50 RBX: ffff88025c6b22a0 RCX: 0000000180200018
> [ 2213.736056] RDX: 0000000180200019 RSI: ffffea0011ca1000 RDI: ffff88025c6b22a0
> [ 2213.742903] RBP: ffff88047fc83dd8 R08: ffffea0011ca1010 R09: 0000000000000000
> [ 2213.749750] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000009
> [ 2213.756597] R13: ffff88025c6b2278 R14: 0000000000000000 R15: ffff880276a40008
> [ 2213.763445] FS: 0000000000000000(0000) GS:ffff88047fc80000(0000) knlGS:0000000000000000
> [ 2213.777149] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2213.782533] CR2: 0000000003a04028 CR3: 0000000001c0a000 CR4: 00000000000406e0
> [ 2213.789379] Stack:
> [ 2213.790491] 0000000000000008 000000000000000a ffff88047fc83dd8 ffff8804749ec600
> [ 2213.797971] 0000000000000009 000000000000000a ffff88047fc83ec8 ffffffff810a9878
> [ 2213.805457] ffff88047fc83de0 ffff88047fc8d178 ffff88047fc95580 0000000000000000
> [ 2213.812931] Call Trace:
> [ 2213.814519] <IRQ>
> [ 2213.815299] [<ffffffff810a9878>] rcu_process_callbacks+0x2a8/0x720
> [ 2213.821704] [<ffffffff8108be0d>] ? run_rebalance_domains+0x18d/0x290
> [ 2213.827776] [<ffffffff81058ee0>] __do_softirq+0x120/0x320
> [ 2213.832780] [<ffffffff810ae803>] ? hrtimer_interrupt+0x113/0x1e0
> [ 2213.838464] [<ffffffff810591b5>] irq_exit+0x75/0x80
> [ 2213.842885] [<ffffffff816394ba>] smp_apic_timer_interrupt+0x4a/0x60
> [ 2213.848857] [<ffffffff81637c29>] apic_timer_interrupt+0x89/0x90
> [ 2213.854437] <EOI>
> [ 2213.855217] [<ffffffff8152d912>] ? cpuidle_enter_state+0x152/0x2c0
> [ 2213.861620] [<ffffffff8152d907>] ? cpuidle_enter_state+0x147/0x2c0
> [ 2213.867495] [<ffffffff8107cb0d>] ? ttwu_do_wakeup+0x1d/0xe0
> [ 2213.872689] [<ffffffff8152da97>] cpuidle_enter+0x17/0x20
> [ 2213.877594] [<ffffffff81094188>] cpu_startup_entry+0x248/0x390
> [ 2213.883081] [<ffffffff810ba802>] ? clockevents_register_device+0x102/0x160
> [ 2213.889736] [<ffffffff810ba550>] ? clockevents_config+0x70/0xa0
> [ 2213.895320] [<ffffffff810ba88c>] ? clockevents_config_and_register+0x2c/0x40
> [ 2213.902169] [<ffffffff81034ec9>] start_secondary+0xf9/0x100
> [ 2213.933638] RIP [<ffffffff810a4af0>] rcu_sync_func+0xa0/0xb0
> [ 2213.939118] RSP <ffff88047fc83da8>
>
>
> The BUG_ON in question is this: BUG_ON(rsp->gp_state != GP_PASSED);
>
> Have you seen something like that before - the kernel is fairly old 4.4.2,
No... thanks, I'll try to look tomorrow.
Oleg.