Re: [PATCH v5 6/7] sched/deadline: Deferrable dl server

From: kernel test robot
Date: Mon Nov 13 2023 - 10:06:24 EST




Hello,

kernel test robot noticed "WARNING:at_kernel/sched/deadline.c:#enqueue_dl_entity" on:

commit: dea46af8e193ed4f23c37123bfd4a825399aedfe ("[PATCH v5 6/7] sched/deadline: Deferrable dl server")
url: https://github.com/intel-lab-lkp/linux/commits/Daniel-Bristot-de-Oliveira/sched-Unify-runtime-accounting-across-classes/20231104-201952
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 984ffb6a4366752c949f7b39640aecdce222607f
patch link: https://lore.kernel.org/all/c7b706d30d6316c52853ca056db5beb82ba72863.1699095159.git.bristot@xxxxxxxxxx/
patch subject: [PATCH v5 6/7] sched/deadline: Deferrable dl server

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

runtime: 600s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


compiler: gcc-9
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


the issue does not always happen in our tests (show 9 times out of 20 runs),
but keeps clean on parent.

6f69498ee58c052e dea46af8e193ed4f23c37123bfd
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:20 45% 9:20 dmesg.RIP:enqueue_dl_entity
:20 45% 9:20 dmesg.WARNING:at_kernel/sched/deadline.c:#enqueue_dl_entity




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202311132217.2a9a4aac-oliver.sang@xxxxxxxxx


[ 59.623267][ C0] ------------[ cut here ]------------
[ 59.627229][ C0] WARNING: CPU: 0 PID: 1 at kernel/sched/deadline.c:1803 enqueue_dl_entity (kernel/sched/deadline.c:1803 (discriminator 1))
[ 59.627229][ C0] Modules linked in:
[ 59.627229][ C0] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G T 6.6.0-rc7-00090-gdea46af8e193 #1
[ 59.627229][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 59.627229][ C0] RIP: 0010:enqueue_dl_entity (kernel/sched/deadline.c:1803 (discriminator 1))
[ 59.627229][ C0] Code: 8e 74 ed ff ff 45 84 f6 0f 85 fd 08 00 00 48 8d b5 08 d8 ff ff 48 8d 95 c8 ee ff ff 4c 89 ff e8 40 f8 01 00 e9 50 ed ff ff 90 <0f> 0b 90 e9 9b ec ff ff 48 8d bd 1c d8 ff ff 48 c7 c3 40 fa 1f 00
All code
========
0: 8e 74 ed ff mov -0x1(%rbp,%rbp,8),%?
4: ff 45 84 incl -0x7c(%rbp)
7: f6 0f 85 testb $0x85,(%rdi)
a: fd std
b: 08 00 or %al,(%rax)
d: 00 48 8d add %cl,-0x73(%rax)
10: b5 08 mov $0x8,%ch
12: d8 ff fdivr %st(7),%st
14: ff 48 8d decl -0x73(%rax)
17: 95 xchg %eax,%ebp
18: c8 ee ff ff enterq $0xffee,$0xff
1c: 4c 89 ff mov %r15,%rdi
1f: e8 40 f8 01 00 callq 0x1f864
24: e9 50 ed ff ff jmpq 0xffffffffffffed79
29: 90 nop
2a:* 0f 0b ud2 <-- trapping instruction
2c: 90 nop
2d: e9 9b ec ff ff jmpq 0xffffffffffffeccd
32: 48 8d bd 1c d8 ff ff lea -0x27e4(%rbp),%rdi
39: 48 c7 c3 40 fa 1f 00 mov $0x1ffa40,%rbx

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 90 nop
3: e9 9b ec ff ff jmpq 0xffffffffffffeca3
8: 48 8d bd 1c d8 ff ff lea -0x27e4(%rbp),%rdi
f: 48 c7 c3 40 fa 1f 00 mov $0x1ffa40,%rbx
[ 59.627229][ C0] RSP: 0000:ffffc90000007d28 EFLAGS: 00010092
[ 59.627229][ C0] RAX: dffffc0000000000 RBX: ffff8883aec00418 RCX: 1ffffffff096d168
[ 59.627229][ C0] RDX: 1ffff11075d80078 RSI: 0000000000000020 RDI: ffff8883aec003c0
[ 59.627229][ C0] RBP: ffff8883aec003c0 R08: ffff8883aec004f0 R09: ffff8883aec00500
[ 59.627229][ C0] R10: 0000000000000000 R11: ffffffff873fba8f R12: ffff8883aec00414
[ 59.627229][ C0] R13: 0000000000000020 R14: ffff8883aebffa58 R15: ffff8883aec003c0
[ 59.627229][ C0] FS: 0000000000000000(0000) GS:ffff8883aea00000(0000) knlGS:0000000000000000
[ 59.627229][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 59.627229][ C0] CR2: ffff88843ffff000 CR3: 0000000004e70000 CR4: 00000000000006b0
[ 59.627229][ C0] Call Trace:
[ 59.627229][ C0] <IRQ>
[ 59.627229][ C0] ? __warn (kernel/panic.c:673)
[ 59.627229][ C0] ? enqueue_dl_entity (kernel/sched/deadline.c:1803 (discriminator 1))
[ 59.627229][ C0] ? report_bug (lib/bug.c:201 lib/bug.c:219)
[ 59.627229][ C0] ? handle_bug (arch/x86/kernel/traps.c:237)
[ 59.627229][ C0] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1))
[ 59.627229][ C0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:568)
[ 59.627229][ C0] ? enqueue_dl_entity (kernel/sched/deadline.c:1803 (discriminator 1))
[ 59.627229][ C0] ? update_rq_clock (kernel/sched/core.c:765 kernel/sched/core.c:750)
[ 59.627229][ C0] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91)
[ 59.627229][ C0] ? sched_clock_tick (kernel/sched/clock.c:270 kernel/sched/clock.c:426 kernel/sched/clock.c:412)
[ 59.627229][ C0] dl_task_timer (kernel/sched/deadline.c:1193)
[ 59.627229][ C0] ? pick_task_dl (kernel/sched/deadline.c:1174)
[ 59.627229][ C0] __hrtimer_run_queues (kernel/time/hrtimer.c:1688 kernel/time/hrtimer.c:1752)
[ 59.627229][ C0] ? enqueue_hrtimer (kernel/time/hrtimer.c:1722)
[ 59.627229][ C0] ? kvm_clock_read (arch/x86/include/asm/preempt.h:95 arch/x86/kernel/kvmclock.c:80)
[ 59.627229][ C0] ? ktime_get_update_offsets_now (kernel/time/timekeeping.c:292 (discriminator 4) kernel/time/timekeeping.c:388 (discriminator 4) kernel/time/timekeeping.c:2320 (discriminator 4))
[ 59.627229][ C0] hrtimer_interrupt (kernel/time/hrtimer.c:1817)
[ 59.627229][ C0] __sysvec_apic_timer_interrupt (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:444 include/linux/jump_label.h:260 include/linux/jump_label.h:270 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1081)
[ 59.627229][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
[ 59.627229][ C0] </IRQ>
[ 59.627229][ C0] <TASK>
[ 59.627229][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645)
[ 59.627229][ C0] RIP: 0010:_raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
[ 59.627229][ C0] Code: 83 c7 18 e8 ca 80 74 fd 48 89 ef e8 82 18 75 fd 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 93 55 69 fd 65 8b 05 54 1a 66 7c 85 c0 74 0a 5b 5d c3 e8 30 63
All code
========
0: 83 c7 18 add $0x18,%edi
3: e8 ca 80 74 fd callq 0xfffffffffd7480d2
8: 48 89 ef mov %rbp,%rdi
b: e8 82 18 75 fd callq 0xfffffffffd751892
10: 81 e3 00 02 00 00 and $0x200,%ebx
16: 75 25 jne 0x3d
18: 9c pushfq
19: 58 pop %rax
1a: f6 c4 02 test $0x2,%ah
1d: 75 2d jne 0x4c
1f: 48 85 db test %rbx,%rbx
22: 74 01 je 0x25
24: fb sti
25: bf 01 00 00 00 mov $0x1,%edi
2a:* e8 93 55 69 fd callq 0xfffffffffd6955c2 <-- trapping instruction
2f: 65 8b 05 54 1a 66 7c mov %gs:0x7c661a54(%rip),%eax # 0x7c661a8a
36: 85 c0 test %eax,%eax
38: 74 0a je 0x44
3a: 5b pop %rbx
3b: 5d pop %rbp
3c: c3 retq
3d: e8 .byte 0xe8
3e: 30 .byte 0x30
3f: 63 .byte 0x63

Code starting with the faulting instruction
===========================================
0: e8 93 55 69 fd callq 0xfffffffffd695598
5: 65 8b 05 54 1a 66 7c mov %gs:0x7c661a54(%rip),%eax # 0x7c661a60
c: 85 c0 test %eax,%eax
e: 74 0a je 0x1a
10: 5b pop %rbx
11: 5d pop %rbp
12: c3 retq
13: e8 .byte 0xe8
14: 30 .byte 0x30
15: 63 .byte 0x63
[ 59.627229][ C0] RSP: 0000:ffffc9000001fbb8 EFLAGS: 00000206
[ 59.627229][ C0] RAX: 0000000000000006 RBX: 0000000000000200 RCX: ffffffff812e2631
[ 59.627229][ C0] RDX: 0000000000000000 RSI: ffffffff83eaa940 RDI: 0000000000000001
[ 59.627229][ C0] RBP: ffff88812d07ed00 R08: 0000000000000001 R09: fffffbfff0e7f757
[ 59.627229][ C0] R10: fffffbfff0e7f756 R11: ffffffff873fbab7 R12: 0000000000000000
[ 59.627229][ C0] R13: 0000000000000246 R14: ffff888195a701a8 R15: ffff88812cd23350
[ 59.627229][ C0] ? mark_lock (arch/x86/include/asm/bitops.h:228 (discriminator 3) arch/x86/include/asm/bitops.h:240 (discriminator 3) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 3) kernel/locking/lockdep.c:228 (discriminator 3) kernel/locking/lockdep.c:4655 (discriminator 3))
[ 59.627229][ C0] dma_fence_signal (drivers/dma-buf/dma-fence.c:327 drivers/dma-buf/dma-fence.c:476)
[ 59.627229][ C0] wait_backward (drivers/dma-buf/st-dma-fence-chain.c:621)
[ 59.627229][ C0] ? find_gap (drivers/dma-buf/st-dma-fence-chain.c:603)
[ 59.627229][ C0] ? __cond_resched (kernel/sched/core.c:8521)
[ 59.627229][ C0] __subtests (drivers/dma-buf/selftest.c:106 (discriminator 1))
[ 59.627229][ C0] ? kmem_cache_open (mm/slub.c:2479 mm/slub.c:4232 mm/slub.c:4560)
[ 59.627229][ C0] ? __sanitycheck__ (drivers/dma-buf/selftest.c:92)
[ 59.627229][ C0] ? kmem_cache_create_usercopy (mm/slab_common.c:351)
[ 59.627229][ C0] dma_fence_chain (drivers/dma-buf/st-dma-fence-chain.c:708)
[ 59.627229][ C0] st_init (drivers/dma-buf/selftest.c:141 drivers/dma-buf/selftest.c:155)
[ 59.627229][ C0] ? udmabuf_dev_init (drivers/dma-buf/selftest.c:154)
[ 59.627229][ C0] do_one_initcall (init/main.c:1232)
[ 59.627229][ C0] ? trace_event_raw_event_initcall_level (init/main.c:1223)
[ 59.627229][ C0] ? parameq (kernel/params.c:171)
[ 59.627229][ C0] ? strcpy (lib/string.c:83 (discriminator 1))
[ 59.627229][ C0] kernel_init_freeable (init/main.c:1293 init/main.c:1310 init/main.c:1329 init/main.c:1547)
[ 59.627229][ C0] ? finish_task_switch (arch/x86/include/asm/irqflags.h:42 arch/x86/include/asm/irqflags.h:77 kernel/sched/sched.h:1390 kernel/sched/core.c:5129 kernel/sched/core.c:5247)
[ 59.627229][ C0] ? rest_init (init/main.c:1429)
[ 59.627229][ C0] kernel_init (init/main.c:1439)
[ 59.627229][ C0] ? _raw_spin_unlock_irq (arch/x86/include/asm/preempt.h:104 include/linux/spinlock_api_smp.h:160 kernel/locking/spinlock.c:202)
[ 59.627229][ C0] ret_from_fork (arch/x86/kernel/process.c:153)
[ 59.627229][ C0] ? rest_init (init/main.c:1429)
[ 59.627229][ C0] ret_from_fork_asm (arch/x86/entry/entry_64.S:312)
[ 59.627229][ C0] </TASK>
[ 59.627229][ C0] irq event stamp: 842784
[ 59.627229][ C0] hardirqs last enabled at (842783): irqentry_exit (kernel/entry/common.c:436)
[ 59.627229][ C0] hardirqs last disabled at (842784): sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074)
[ 59.627229][ C0] softirqs last enabled at (842772): __do_softirq (arch/x86/include/asm/preempt.h:27 kernel/softirq.c:400 kernel/softirq.c:582)
[ 59.627229][ C0] softirqs last disabled at (842761): irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:622 kernel/softirq.c:644)
[ 59.627229][ C0] ---[ end trace 0000000000000000 ]---


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231113/202311132217.2a9a4aac-oliver.sang@xxxxxxxxx



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki