[syzbot] INFO: rcu detected stall in batadv_tt_purge (2)

From: syzbot
Date: Mon Jul 26 2021 - 07:11:28 EST


Hello,

syzbot found the following issue on:

HEAD commit: 7b6ae471e541 Merge tag 'spi-fix-v5.14-rc2' of git://git.ke..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17849f74300000
kernel config: https://syzkaller.appspot.com/x/.config?x=f7dfeb6dfc05ea19
dashboard link: https://syzkaller.appspot.com/bug?extid=08504d998fc4589919ac
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12fc675a300000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1566d2ea300000

The issue was bisected to:

commit b5b73b26b3ca34574124ed7ae9c5ba8391a7f176
Author: Vinicius Costa Gomes <vinicius.gomes@xxxxxxxxx>
Date: Thu Sep 10 00:03:11 2020 +0000

taprio: Fix allowing too small intervals

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1468a696300000
final oops: https://syzkaller.appspot.com/x/report.txt?x=1668a696300000
console output: https://syzkaller.appspot.com/x/log.txt?x=1268a696300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+08504d998fc4589919ac@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals")

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-...!: (2 ticks this GP) idle=4de/1/0x4000000000000000 softirq=11238/11238 fqs=0
(t=13326 jiffies g=13217 q=742)
rcu: rcu_preempt kthread timer wakeup didn't happen for 13325 jiffies! g13217 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=0 timer-softirq=29073
rcu: rcu_preempt kthread starved for 13326 jiffies! g13217 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:I stack:29096 pid: 14 ppid: 2 flags:0x00004000
Call Trace:
context_switch kernel/sched/core.c:4683 [inline]
__schedule+0x93a/0x26f0 kernel/sched/core.c:5940
schedule+0xd3/0x270 kernel/sched/core.c:6019
schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1878
rcu_gp_fqs_loop kernel/rcu/tree.c:1996 [inline]
rcu_gp_kthread+0xd34/0x1980 kernel/rcu/tree.c:2169
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
rcu: Stack dump where RCU GP kthread last ran:
NMI backtrace for cpu 0
CPU: 0 PID: 10 Comm: kworker/u4:1 Not tainted 5.14.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_tt_purge
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_check_gp_kthread_starvation.cold+0x1d1/0x1d6 kernel/rcu/tree_stall.h:479
print_cpu_stall kernel/rcu/tree_stall.h:623 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:700 [inline]
rcu_pending kernel/rcu/tree.c:3922 [inline]
rcu_sched_clock_irq.cold+0x9a/0x747 kernel/rcu/tree.c:2641
update_process_times+0x16d/0x200 kernel/time/timer.c:1782
tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1421
__run_hrtimer kernel/time/hrtimer.c:1537 [inline]
__hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1601
hrtimer_interrupt+0x330/0xa00 kernel/time/hrtimer.c:1663
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
__sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1106
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:__local_bh_enable_ip+0xa8/0x120 kernel/softirq.c:390
Code: 1d ad 91 bc 7e 65 8b 05 a6 91 bc 7e a9 00 ff ff 00 74 45 bf 01 00 00 00 e8 15 2a 09 00 e8 50 84 35 00 fb 65 8b 05 88 91 bc 7e <85> c0 74 58 5b 5d c3 65 8b 05 d6 98 bc 7e 85 c0 75 a2 0f 0b eb 9e
RSP: 0018:ffffc90000f0fc10 EFLAGS: 00000206
RAX: 0000000080000000 RBX: 00000000fffffe00 RCX: 1ffffffff1fa584a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffff88ca93e5 R08: 0000000000000001 R09: ffffffff8fcd7987
R10: 0000000000000001 R11: 0000000000000000 R12: ffffc90000f0fdb0
R13: ffff888042de12f8 R14: 0000000000000082 R15: dffffc0000000000
spin_unlock_bh include/linux/spinlock.h:399 [inline]
batadv_tt_local_purge+0x285/0x370 net/batman-adv/translation-table.c:1369
batadv_tt_purge+0x2c/0xaf0 net/batman-adv/translation-table.c:3591
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
NMI backtrace for cpu 0
CPU: 0 PID: 10 Comm: kworker/u4:1 Not tainted 5.14.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_tt_purge
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:342
print_cpu_stall kernel/rcu/tree_stall.h:625 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:700 [inline]
rcu_pending kernel/rcu/tree.c:3922 [inline]
rcu_sched_clock_irq.cold+0x9f/0x747 kernel/rcu/tree.c:2641
update_process_times+0x16d/0x200 kernel/time/timer.c:1782
tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1421
__run_hrtimer kernel/time/hrtimer.c:1537 [inline]
__hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1601
hrtimer_interrupt+0x330/0xa00 kernel/time/hrtimer.c:1663
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
__sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1106
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:__local_bh_enable_ip+0xa8/0x120 kernel/softirq.c:390
Code: 1d ad 91 bc 7e 65 8b 05 a6 91 bc 7e a9 00 ff ff 00 74 45 bf 01 00 00 00 e8 15 2a 09 00 e8 50 84 35 00 fb 65 8b 05 88 91 bc 7e <85> c0 74 58 5b 5d c3 65 8b 05 d6 98 bc 7e 85 c0 75 a2 0f 0b eb 9e
RSP: 0018:ffffc90000f0fc10 EFLAGS: 00000206
RAX: 0000000080000000 RBX: 00000000fffffe00 RCX: 1ffffffff1fa584a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffff88ca93e5 R08: 0000000000000001 R09: ffffffff8fcd7987
R10: 0000000000000001 R11: 0000000000000000 R12: ffffc90000f0fdb0
R13: ffff888042de12f8 R14: 0000000000000082 R15: dffffc0000000000
spin_unlock_bh include/linux/spinlock.h:399 [inline]
batadv_tt_local_purge+0x285/0x370 net/batman-adv/translation-table.c:1369
batadv_tt_purge+0x2c/0xaf0 net/batman-adv/translation-table.c:3591
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 9018 Comm: kworker/u4:2 Not tainted 5.14.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:__sanitizer_cov_trace_const_cmp4+0xc/0x70 kernel/kcov.c:283
Code: 00 00 00 48 89 7c 30 e8 48 89 4c 30 f0 4c 89 54 d8 20 48 89 10 5b c3 0f 1f 80 00 00 00 00 41 89 f8 bf 03 00 00 00 4c 8b 14 24 <89> f1 65 48 8b 34 25 00 f0 01 00 e8 e4 ee ff ff 84 c0 74 4b 48 8b
RSP: 0018:ffffc90000fd8df8 EFLAGS: 00000046
RAX: 0000000000010002 RBX: ffff8880b9d424c0 RCX: 0000000000000000
RDX: ffff88802e756280 RSI: 0000000000000001 RDI: 0000000000000003
RBP: ffff88802539bb40 R08: 0000000000000007 R09: ffffffff904c64ab
R10: ffffffff816467fb R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000000 R14: ffff8880b9d423c0 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000600 CR3: 000000000b68e000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
cpu_max_bits_warn include/linux/cpumask.h:108 [inline]
cpumask_check include/linux/cpumask.h:115 [inline]
cpumask_test_cpu include/linux/cpumask.h:344 [inline]
cpu_online include/linux/cpumask.h:895 [inline]
trace_hrtimer_start include/trace/events/timer.h:195 [inline]
debug_activate kernel/time/hrtimer.c:476 [inline]
enqueue_hrtimer+0x4b/0x3e0 kernel/time/hrtimer.c:982
__run_hrtimer kernel/time/hrtimer.c:1554 [inline]
__hrtimer_run_queues+0xb02/0xe50 kernel/time/hrtimer.c:1601
hrtimer_interrupt+0x330/0xa00 kernel/time/hrtimer.c:1663
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
__sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1106
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:csd_lock_wait kernel/smp.c:440 [inline]
RIP: 0010:smp_call_function_many_cond+0x452/0xc20 kernel/smp.c:967
Code: 0b 00 85 ed 74 4d 48 b8 00 00 00 00 00 fc ff df 4d 89 f4 4c 89 f5 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 e8 c0 48 0b 00 f3 90 <41> 0f b6 04 24 40 38 c5 7c 08 84 c0 0f 85 33 06 00 00 8b 43 08 31
RSP: 0018:ffffc900037b7a00 EFLAGS: 00000293
RAX: 0000000000000000 RBX: ffff8880b9c55de0 RCX: 0000000000000000
RDX: ffff88802e756280 RSI: ffffffff816975f0 RDI: 0000000000000003
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff81697616 R11: 0000000000000000 R12: ffffed101738abbd
R13: 0000000000000000 R14: ffff8880b9c55de8 R15: 0000000000000001
on_each_cpu_cond_mask+0x56/0xa0 kernel/smp.c:1133
on_each_cpu include/linux/smp.h:71 [inline]
text_poke_sync arch/x86/kernel/alternative.c:929 [inline]
text_poke_bp_batch+0x1b3/0x560 arch/x86/kernel/alternative.c:1114
text_poke_flush arch/x86/kernel/alternative.c:1268 [inline]
text_poke_flush arch/x86/kernel/alternative.c:1265 [inline]
text_poke_finish+0x16/0x30 arch/x86/kernel/alternative.c:1275
arch_jump_label_transform_apply+0x13/0x20 arch/x86/kernel/jump_label.c:145
jump_label_update+0x1d5/0x430 kernel/jump_label.c:830
static_key_disable_cpuslocked+0x152/0x1b0 kernel/jump_label.c:207
static_key_disable+0x16/0x20 kernel/jump_label.c:215
toggle_allocation_gate mm/kfence/core.c:637 [inline]
toggle_allocation_gate+0x185/0x390 mm/kfence/core.c:615
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches