[linus:master] [sched, x86] 476e8583ca: WARNING:at_kernel/rcu/update.c:#torture_sched_setaffinity
From: kernel test robot
Date: Mon Nov 25 2024 - 10:16:15 EST
Hello,
by this commit, we see the config has below diff:
--- /pkg/linux/x86_64-randconfig-161-20241120/gcc-12/35772d627b55cc7fb4f33bae57c564a25b3121a9/.config 2024-11-22 17:03:32.458344665 +0800
+++ /pkg/linux/x86_64-randconfig-161-20241120/gcc-12/476e8583ca16eecec0a3a28b6ee7130f4e369389/.config 2024-11-22 17:02:59.440805587 +0800
@@ -121,9 +121,11 @@ CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
# end of BPF subsystem
CONFIG_PREEMPT_BUILD=y
-CONFIG_PREEMPT_NONE=y
+CONFIG_ARCH_HAS_PREEMPT_LAZY=y
+# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
+CONFIG_PREEMPT_LAZY=y
# CONFIG_PREEMPT_RT is not set
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
and we observed the issue does not always happen, but with high rate:
35772d627b55cc7f 476e8583ca16eecec0a3a28b6ee
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:47 70% 33:48 dmesg.RIP:torture_sched_setaffinity
:47 70% 33:48 dmesg.WARNING:at_kernel/rcu/update.c:#torture_sched_setaffinity
below is full report FYI.
kernel test robot noticed "WARNING:at_kernel/rcu/update.c:#torture_sched_setaffinity" on:
commit: 476e8583ca16eecec0a3a28b6ee7130f4e369389 ("sched, x86: Enable Lazy preemption")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master fc39fb56917bb3cb53e99560ca3612a84456ada2]
[test failed on linux-next/master decc701f41d07481893fdea942c0ac6b226e84cd]
in testcase: rcutorture
version:
with following parameters:
runtime: 300s
test: cpuhotplug
torture_type: trivial
config: x86_64-randconfig-161-20241120
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202411252253.e39d77c6-lkp@xxxxxxxxx
[ 150.797530][ T445] ------------[ cut here ]------------
[ 150.797915][ T445] torture_sched_setaffinity: sched_setaffinity(445) returned -22
[ 150.798353][ T445] WARNING: CPU: 0 PID: 445 at kernel/rcu/update.c:535 torture_sched_setaffinity (kernel/rcu/update.c:535 (discriminator 3))
[ 150.798909][ T445] Modules linked in: rcutorture torture ipmi_devintf ipmi_msghandler joydev crc32c_intel polyval_clmulni sha512_ssse3 sha256_ssse3 psmouse pcspkr input_leds led_class evdev drm drm_panel_orientation_quirks efi_pstore i2c_core dmi_sysfs qemu_fw_cfg aesni_intel nhc_udp nhc_routing nhc_mobility nhc_hop nhc_fragment nhc_dest
[ 150.800845][ T445] CPU: 0 UID: 0 PID: 445 Comm: rcu_torture_wri Tainted: G W 6.12.0-rc4-00032-g476e8583ca16 #1 2d4e54818569ec14d7868b657e3682f49dc47408
[ 150.801671][ T445] Tainted: [W]=WARN
[ 150.801886][ T445] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 150.802444][ T445] RIP: 0010:torture_sched_setaffinity (kernel/rcu/update.c:535 (discriminator 3))
[ 150.802909][ T445] Code: 11 06 00 75 e7 89 c1 48 89 45 f0 c6 05 9e 73 11 06 01 90 48 c7 c6 a0 07 0e 85 44 89 e2 48 c7 c7 00 d9 0d 85 e8 88 86 e0 ff 90 <0f> 0b 90 90 48 8b 45 f0 4c 8b 65 f8 c9 48 98 31 d2 31 c9 31 f6 31
All code
========
0: 11 06 adc %eax,(%rsi)
2: 00 75 e7 add %dh,-0x19(%rbp)
5: 89 c1 mov %eax,%ecx
7: 48 89 45 f0 mov %rax,-0x10(%rbp)
b: c6 05 9e 73 11 06 01 movb $0x1,0x611739e(%rip) # 0x61173b0
12: 90 nop
13: 48 c7 c6 a0 07 0e 85 mov $0xffffffff850e07a0,%rsi
1a: 44 89 e2 mov %r12d,%edx
1d: 48 c7 c7 00 d9 0d 85 mov $0xffffffff850dd900,%rdi
24: e8 88 86 e0 ff call 0xffffffffffe086b1
29: 90 nop
2a:* 0f 0b ud2 <-- trapping instruction
2c: 90 nop
2d: 90 nop
2e: 48 8b 45 f0 mov -0x10(%rbp),%rax
32: 4c 8b 65 f8 mov -0x8(%rbp),%r12
36: c9 leave
37: 48 98 cltq
39: 31 d2 xor %edx,%edx
3b: 31 c9 xor %ecx,%ecx
3d: 31 f6 xor %esi,%esi
3f: 31 .byte 0x31
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 90 nop
3: 90 nop
4: 48 8b 45 f0 mov -0x10(%rbp),%rax
8: 4c 8b 65 f8 mov -0x8(%rbp),%r12
c: c9 leave
d: 48 98 cltq
f: 31 d2 xor %edx,%edx
11: 31 c9 xor %ecx,%ecx
13: 31 f6 xor %esi,%esi
15: 31 .byte 0x31
[ 150.803951][ T445] RSP: 0018:ffffc90001d0fc50 EFLAGS: 00010246
[ 150.804295][ T445] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 150.804738][ T445] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 150.805173][ T445] RBP: ffffc90001d0fc60 R08: 0000000000000000 R09: 0000000000000000
[ 150.805664][ T445] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000001bd
[ 150.806126][ T445] R13: ffffed10298bc638 R14: dffffc0000000000 R15: ffff88814c5e2c00
[ 150.806567][ T445] FS: 0000000000000000(0000) GS:ffff8883aee00000(0000) knlGS:0000000000000000
[ 150.807061][ T445] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 150.807425][ T445] CR2: 00007fd03312b000 CR3: 0000000101aa7000 CR4: 00000000000406f0
[ 150.807863][ T445] Call Trace:
[ 150.808055][ T445] <TASK>
[ 150.808230][ T445] ? show_regs (arch/x86/kernel/dumpstack.c:479)
[ 150.808482][ T445] ? __warn (kernel/panic.c:748)
[ 150.808722][ T445] ? torture_sched_setaffinity (kernel/rcu/update.c:535 (discriminator 3))
[ 150.809050][ T445] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 150.809325][ T445] ? handle_bug (arch/x86/kernel/traps.c:285)
[ 150.809631][ T445] ? exc_invalid_op (arch/x86/kernel/traps.c:309 (discriminator 1))
[ 150.809907][ T445] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[ 150.810206][ T445] ? torture_sched_setaffinity (kernel/rcu/update.c:535 (discriminator 3))
[ 150.810536][ T445] synchronize_rcu_trivial (kernel/rcu/rcutorture.c:837) rcutorture
[ 150.811100][ T445] ? debug_smp_processor_id (lib/smp_processor_id.c:61)
[ 150.811407][ T445] ? rcu_torture_read_exit (kernel/rcu/rcutorture.c:832) rcutorture
[ 150.811996][ T445] do_rtws_sync+0x395/0x6e0 rcutorture
[ 150.812579][ T445] ? srcu_read_delay (kernel/rcu/rcutorture.c:1328) rcutorture
[ 150.813133][ T445] ? preempt_count_sub (kernel/sched/core.c:5867 kernel/sched/core.c:5863 kernel/sched/core.c:5885)
[ 150.813443][ T445] ? delay_tsc (arch/x86/include/asm/preempt.h:103 arch/x86/lib/delay.c:96)
[ 150.813697][ T445] rcu_torture_writer (kernel/rcu/rcutorture.c:1578) rcutorture
[ 150.814240][ T445] ? __kthread_parkme (include/linux/instrumented.h:68 include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/kthread.c:280)
[ 150.814524][ T445] ? reacquire_held_locks (kernel/locking/lockdep.c:5476)
[ 150.814841][ T445] ? rcu_torture_fwd_prog_cr (kernel/rcu/rcutorture.c:1364) rcutorture
[ 150.815419][ T445] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/preempt.h:103 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
[ 150.815756][ T445] ? __kthread_parkme (include/linux/instrumented.h:68 include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/kthread.c:280)
[ 150.816076][ T445] ? __kasan_check_read (mm/kasan/shadow.c:32)
[ 150.816368][ T445] ? __kthread_parkme (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 kernel/kthread.c:280)
[ 150.816648][ T445] ? trace_preempt_on (kernel/trace/trace_preemptirq.c:105)
[ 150.816936][ T445] ? rcu_torture_fwd_prog_cr (kernel/rcu/rcutorture.c:1364) rcutorture
[ 150.817541][ T445] kthread (kernel/kthread.c:389)
[ 150.817780][ T445] ? trace_sched_kthread_stop_ret (kernel/kthread.c:342)
[ 150.818127][ T445] ret_from_fork (arch/x86/kernel/process.c:153)
[ 150.818382][ T445] ? trace_sched_kthread_stop_ret (kernel/kthread.c:342)
[ 150.818730][ T445] ret_from_fork_asm (arch/x86/entry/entry_64.S:254)
[ 150.819022][ T445] </TASK>
[ 150.819201][ T445] irq event stamp: 230151
[ 150.819484][ T445] hardirqs last enabled at (230163): __up_console_sem (arch/x86/include/asm/irqflags.h:42 (discriminator 1) arch/x86/include/asm/irqflags.h:97 (discriminator 1) arch/x86/include/asm/irqflags.h:155 (discriminator 1) kernel/printk/printk.c:344 (discriminator 1))
[ 150.820008][ T445] hardirqs last disabled at (230172): __up_console_sem (kernel/printk/printk.c:342 (discriminator 1))
[ 150.820529][ T445] softirqs last enabled at (229700): handle_softirqs (arch/x86/include/asm/preempt.h:26 kernel/softirq.c:401 kernel/softirq.c:582)
[ 150.821054][ T445] softirqs last disabled at (229651): irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637 kernel/softirq.c:627 kernel/softirq.c:649)
[ 150.821585][ T445] ---[ end trace 0000000000000000 ]---
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241125/202411252253.e39d77c6-lkp@xxxxxxxxx
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki