Re: sched: tweak select_idle_sibling to look for idle threads

From: Peter Zijlstra
Date: Fri May 06 2016 - 03:25:50 EST


On Tue, May 03, 2016 at 11:11:53AM -0400, Chris Mason wrote:
> # pick a single core, in my case cpus 0,20 are the same core
> # cpu_hog is any program that spins
> #
> taskset -c 20 cpu_hog &
>
> # schbench -p 4 means message passing mode with 4 byte messages (like
> # pipe test), no sleeps, just bouncing as fast as it can.
> #
> # make the scheduler choose between the sibling of the hog and cpu 1
> #
> taskset -c 0,1 schbench -p 4 -m 1 -t 1
>
> Current mainline will stuff both schbench threads onto CPU 1, leaving
> CPU 0 100% idle. My first patch with the minimal task_hot() checks
> would sometimes pick CPU 0. My second patch that just directly calls
> task_hot sticks to cpu1, which is ~3x faster than spreading it.

Ok, with the thing fixed, my current patch seems to do the right thing.
If I trace sched_migrate_task() I get:

$ grep schbench trace

doit-schbench-2-4042 [004] d..3 144541.309747: sched_migrate_task: comm=doit-schbench-2 pid=4042 prio=120 orig_cpu=4 dest_cpu=4
doit-schbench-2-4042 [004] d..2 144541.309772: sched_migrate_task: comm=doit-schbench-2 pid=4043 prio=120 orig_cpu=4 dest_cpu=11
doit-schbench-2-4042 [004] d..3 144541.309855: sched_migrate_task: comm=doit-schbench-2 pid=4042 prio=120 orig_cpu=4 dest_cpu=4
doit-schbench-2-4042 [004] d..2 144541.309882: sched_migrate_task: comm=doit-schbench-2 pid=4044 prio=120 orig_cpu=4 dest_cpu=5
migration/11-77 [011] d..4 144541.309974: sched_migrate_task: comm=doit-schbench-2 pid=4043 prio=120 orig_cpu=11 dest_cpu=12
migration/5-40 [005] d..4 144541.310013: sched_migrate_task: comm=doit-schbench-2 pid=4044 prio=120 orig_cpu=5 dest_cpu=6
schbench-4044 [001] d..3 144541.310995: sched_migrate_task: comm=schbench pid=4044 prio=120 orig_cpu=1 dest_cpu=1
schbench-4044 [001] d..2 144541.310999: sched_migrate_task: comm=schbench pid=4045 prio=120 orig_cpu=1 dest_cpu=1
schbench-4045 [001] d..3 144541.311232: sched_migrate_task: comm=schbench pid=4045 prio=120 orig_cpu=1 dest_cpu=1
schbench-4045 [001] d..2 144541.311234: sched_migrate_task: comm=schbench pid=4046 prio=120 orig_cpu=1 dest_cpu=1

So schbench gets put on cpu1 and never leaves: every sched_migrate_task
event with comm=schbench has orig_cpu=1 and dest_cpu=1.
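
For longer traces the same check can be done mechanically. What follows is
only a minimal sketch, assuming the text format of the trace lines shown
above. The names EVENT, summarize() and SAMPLE are hypothetical, made up for
illustration, and the regex assumes comm contains no spaces. SAMPLE holds a
few lines copied from the trace above.

```python
import re
from collections import Counter

# Matches the key=value tail of a sched_migrate_task trace line.
# Assumption: comm contains no whitespace.
EVENT = re.compile(
    r"sched_migrate_task: comm=(?P<comm>\S+) pid=(?P<pid>\d+) "
    r"prio=\d+ orig_cpu=(?P<orig>\d+) dest_cpu=(?P<dest>\d+)"
)

def summarize(lines, comm):
    """Return (Counter of dest_cpu values, number of events where
    orig_cpu != dest_cpu) for events whose comm matches exactly."""
    dests = Counter()
    moves = 0
    for line in lines:
        m = EVENT.search(line)
        if m is None or m.group("comm") != comm:
            continue
        orig, dest = int(m.group("orig")), int(m.group("dest"))
        dests[dest] += 1
        if orig != dest:
            moves += 1
    return dests, moves

# A few lines copied from the trace above.
SAMPLE = """\
doit-schbench-2-4042 [004] d..2 144541.309772: sched_migrate_task: comm=doit-schbench-2 pid=4043 prio=120 orig_cpu=4 dest_cpu=11
migration/11-77 [011] d..4 144541.309974: sched_migrate_task: comm=doit-schbench-2 pid=4043 prio=120 orig_cpu=11 dest_cpu=12
schbench-4044 [001] d..3 144541.310995: sched_migrate_task: comm=schbench pid=4044 prio=120 orig_cpu=1 dest_cpu=1
schbench-4044 [001] d..2 144541.310999: sched_migrate_task: comm=schbench pid=4045 prio=120 orig_cpu=1 dest_cpu=1
schbench-4045 [001] d..3 144541.311232: sched_migrate_task: comm=schbench pid=4045 prio=120 orig_cpu=1 dest_cpu=1
schbench-4045 [001] d..2 144541.311234: sched_migrate_task: comm=schbench pid=4046 prio=120 orig_cpu=1 dest_cpu=1
"""

dests, moves = summarize(SAMPLE.splitlines(), "schbench")
print(dict(dests), moves)  # prints: {1: 4} 0
```

A moves count of 0 with a single dest_cpu key means the task was never
placed on any other cpu. Passing "doit-schbench-2" as comm instead picks up
the wrapper's events, which do move between cpus.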