Re: [PATCH v3] sched/deadline: fix earliest_dl.next logic

From: Wanpeng Li
Date: Sun Nov 29 2015 - 21:21:02 EST


2015-11-27 20:14 GMT+08:00 Luca Abeni <luca.abeni@xxxxxxxx>:
> Hi all,
>
> I ran some quick tests on this patch (because I was working on something
> related), and it seems to me that it triggers a bug. Here are some
> information:
> - I applied the patch to git master (to be sure that I did not corrupt the
> patch,
> I extracted it again with git format-patch, and posted it here:
>
> http://disi.unitn.it/~abeni/H-Test/0001-sched-deadline-fix-earliest_dl.next-logic.patch
> )
> - I tried a simple test program creating 4 periodic threads and scheduling
> them with
> SCHED_DEADLINE. And I got something like:
> [ 29.055688] ------------[ cut here ]------------
> [ 29.056563] kernel BUG at
> /home/luca/Src/Kernel/linux/kernel/sched/deadline.c:1443!
> [ 29.056563] invalid opcode: 0000 [#1] SMP
> [ 29.056563] Modules linked in:
> [ 29.056563] CPU: 0 PID: 1136 Comm: periodic_thread Not tainted 4.4.0-rc2+
> #3
> [ 29.056563] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> Bochs 01/01/2011
> [ 29.056563] task: ffff880006603d80 ti: ffff8800067f4000 task.ti:
> ffff8800067f4000
> [ 29.056563] RIP: 0010:[<ffffffff8109704a>] [<ffffffff8109704a>]
> pick_next_pushable_dl_task+0x5a/0x60
> [ 29.056563] RSP: 0000:ffff8800067f7e38 EFLAGS: 00010046
> [ 29.056563] RAX: ffff880006603d80 RBX: ffff880006a7ac50 RCX:
> ffff8800067f4000
> [ 29.056563] RDX: ffff8800066040d0 RSI: 0000000000000000 RDI:
> ffff880007a15ac0
> [ 29.056563] RBP: ffff8800067f7e38 R08: 0000000000000000 R09:
> 0000000000000000
> [ 29.056563] R10: ffff880006603d80 R11: ffff880006604548 R12:
> ffff880007a15ac0
> [ 29.056563] R13: ffff880006a7a900 R14: ffff8800066042e0 R15:
> ffffffff81a09cc0
> [ 29.056563] FS: 00007f667e179700(0000) GS:ffff880007a00000(0000)
> knlGS:0000000000000000
> [ 29.056563] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 29.056563] CR2: 00007f71f3c221b8 CR3: 00000000067ae000 CR4:
> 00000000000006f0
> [ 29.056563] Stack:
> [ 29.056563] ffff8800067f7e60 ffffffff81097137 ffff880007a15ac0
> ffff880006a7ab20
> [ 29.056563] ffff880006a7a900 ffff8800067f7e90 ffffffff81098ef1
> ffff880006603d80
> [ 29.056563] ffff880007a15ac0 ffff880007a15ad8 0000000000000000
> ffff8800067f7ee8
> [ 29.056563] Call Trace:
> [ 29.056563] [<ffffffff81097137>] dequeue_pushable_dl_task+0x47/0x90
> [ 29.056563] [<ffffffff81098ef1>] pick_next_task_dl+0x81/0x200
> [ 29.056563] [<ffffffff819521ad>] __schedule+0xb3d/0xd00
> [ 29.056563] [<ffffffff819523a7>] schedule+0x37/0x80
> [ 29.056563] [<ffffffff81050d46>] exit_to_usermode_loop+0x39/0x7f
> [ 29.056563] [<ffffffff81001a76>] prepare_exit_to_usermode+0x46/0x50
> [ 29.056563] [<ffffffff819586a5>] retint_user+0x8/0x20
> [ 29.056563] Code: 3b 87 18 09 00 00 74 23 83 7a 98 01 7e 1b 83 ba f4 fc
> ff ff 01 75 10 8b 92 f8 fc ff ff 85 d2 78 04 0f 0b 31 c0 5d c3 0f 0b 0f 0b
> <0f> 0b 0f 0b 66 90 55 48 89 e5 41 54 49 89 f4 53 48 8b b7 18 09
> [ 29.056563] RIP [<ffffffff8109704a>]
> pick_next_pushable_dl_task+0x5a/0x60
> [ 29.056563] RSP <ffff8800067f7e38>
> [ 29.056563] ---[ end trace 9f49218ec54a0b55 ]---
> - How to reproduce:
> + Download http://disi.unitn.it/~abeni/H-Test/newcore-3.gz
> + Start KVM with:
> qemu-system-x86_64 -initrd newcore-3.gz -machine
> pc-q35-2.0,accel=kvm -smp cpus=2 -kernel bzImage -serial file:qlog.txt
> -append "console=ttyS0"
> + Inside the VM, do
> cd /
> sudo ./periodic_thread
> + You should see a crash in few seconds
> The "periodic_thread" program just creates 4 periodic threads (with
> execution time about
> 10ms and period about 100ms) scheduled with SCHED_DEADLINE (with runtime
> 15ms and period
> 100ms). I can provide the source code if needed.
> - The full kernel log is available here:
> http://disi.unitn.it/~abeni/H-Test/qlog.txt
>
>
> Here is my understanding of the crash:
> - schedule() invokes pick_next_task_dl() which wants to do a context switch
> (by selecting
> for execution a new task "p" which is different from "prev")
> - pick_next_task_dl() invokes put_prev, which puts the "prev" task in the
> pushable tasks
> queue (WARNING! "prev" is still the "current" task in this rq, because the
> scheduler is
> still running... I think this is the source of the issue)
> - then, pick_next_task_dl() invokes dequeue_pushable_dl_task() on p, to
> remove the selected
> task from the pushable tasks queue...
> - ...But after your patch dequeue_pushable_dl_task() invokes
> pick_next_pushable_dl_task().
> Which sees that the next pushable task is the "current" task (see above).
> This happens
> becuase "prev" has already been inserted in the pushable tasks queue, and
> can be the
> next pushable task... But "current" has not been updated yet.
> - The BUG_ON() at line 1443 of deadline.c is just "BUG_ON(task_current(rq,
> p))"

Thanks for your report and great analyse, Luca, you are right.

>
> Summing up, I think pick_next_pushable_dl_task() cannot be called from
> dequeue_pushable_dl_task() (at least, not without removing or modifying that
> BUG_ON()).

Juri, how about remove this BUG_ON() just like rt class?

Regards,
Wanpeng Li
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/