Re: [PATCH?] Livelock in pick_next_task_fair() / idle_balance()

From: Mike Galbraith
Date: Fri Jul 03 2015 - 00:42:51 EST


On Fri, 2015-07-03 at 02:42 +0800, Yuyang Du wrote:

> But still, I think, even with the above, in idle balancing, pulling until the source
> rq's nr_running == 1 is not just "a short term fix", but should be there permanently
> acting like a last guard with no overhead, why not.

Yeah, seems so.  Searching the trace for "steal everything" samples...
(this is all with autogroup)

load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 3 imb: 23 det_tasks: 2 det_load: 3 zeros: 1
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 0 imb: 32 det_tasks: 2 det_load: 0 zeros: 2
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 1 imb: 17 det_tasks: 2 det_load: 1 zeros: 1
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 37 imb: 22 det_tasks: 2 det_load: 37 zeros: 1
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 0 imb: 102 det_tasks: 2 det_load: 0 zeros: 2


load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 93 imb: 47 det_tasks: 1 det_load: 93 zeros: 0
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 202 imb: 125 det_tasks: 2 det_load: 202 zeros: 0
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 243 imb: 188 det_tasks: 2 det_load: 243 zeros: 0
load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 145 imb: 73 det_tasks: 1 det_load: 145 zeros: 0
load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 46 imb: 24 det_tasks: 1 det_load: 46 zeros: 0

Both varieties of total pilferage (with and without zero-load tasks
involved) seem to happen only during idle balance, never during
periodic balance (yet).
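
For anyone wanting to repeat the search, here is a minimal sketch of
the filter, not the script actually used.  It assumes a "steal
everything" sample is one where det_tasks equals d_run, which is what
the samples above have in common.  The field meanings are inferred from
the trace lines only, and parse_sample() and is_steal_all() are
hypothetical helper names.

```python
import re

# Field layout copied from the trace lines in this mail.  What each field
# means is an assumption: the debug patch that emits them is not shown here.
SAMPLE_RE = re.compile(
    r"load_balance: (?P<kind>\w+) - "
    r"s_run: (?P<s_run>\d+) d_run: (?P<d_run>\d+) "
    r"s_load: (?P<s_load>\d+) d_load: (?P<d_load>\d+) "
    r"imb: (?P<imb>\d+) det_tasks: (?P<det_tasks>\d+) "
    r"det_load: (?P<det_load>\d+) zeros: (?P<zeros>\d+)"
)


def parse_sample(line):
    """Return the fields of one load_balance line as a dict, or None."""
    m = SAMPLE_RE.search(line)
    if m is None:
        return None
    sample = {k: int(v) for k, v in m.groupdict().items() if k != "kind"}
    sample["kind"] = m.group("kind")  # "idle" or "norm" in the samples here
    return sample


def is_steal_all(sample):
    """Assumed criterion: every task counted in d_run was detached."""
    return sample["det_tasks"] > 0 and sample["det_tasks"] == sample["d_run"]


if __name__ == "__main__":
    demo = [
        "load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 3 "
        "imb: 23 det_tasks: 2 det_load: 3 zeros: 1",
        "load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 93 "
        "imb: 47 det_tasks: 1 det_load: 93 zeros: 0",
        "load_balance: norm - s_run: 1 d_run: 9 s_load: 67 d_load: 1110 "
        "imb: 86 det_tasks: 8 det_load: 86 zeros: 0",
    ]
    for line in demo:
        s = parse_sample(line)
        if s and is_steal_all(s):
            flavor = "with" if s["zeros"] else "without"
            print(s["kind"], "steal-all,", flavor, "zero-load tasks")
```

Splitting the hits on the kind field (idle vs norm) and on zeros gives
the two varieties described above.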

Oddity: make -j8 occasionally stacks/pulls piles of tasks with dinky load.

homer:/sys/kernel/debug/tracing # for i in `seq 1 10`; do cat trace|grep "s_run: 1.*det_tasks: $i.*zeros: 0"|wc -l; done
71634
1567
79
15
1
3
0
2
3
0
homer:/sys/kernel/debug/tracing # cat trace|grep "s_run: 1.*det_tasks: 8.*zeros: 0"
<idle>-0 [002] dNs. 594.973783: load_balance: norm - s_run: 1 d_run: 9 s_load: 67 d_load: 1110 imb: 86 det_tasks: 8 det_load: 86 zeros: 0
<...>-10367 [007] d... 1456.477281: load_balance: idle - s_run: 1 d_run: 8 s_load: 805 d_load: 22 imb: 45 det_tasks: 8 det_load: 22 zeros: 0
homer:/sys/kernel/debug/tracing # cat trace|grep "s_run: 1.*det_tasks: 9.*zeros: 0"
<...>-23317 [004] d... 486.677925: load_balance: idle - s_run: 1 d_run: 9 s_load: 888 d_load: 27 imb: 47 det_tasks: 9 det_load: 27 zeros: 0
<...>-11485 [002] d... 573.411095: load_balance: idle - s_run: 1 d_run: 9 s_load: 124 d_load: 78 imb: 82 det_tasks: 9 det_load: 78 zeros: 0
<...>-23286 [000] d... 1510.378740: load_balance: idle - s_run: 1 d_run: 9 s_load: 102 d_load: 58 imb: 63 det_tasks: 9 det_load: 58 zeros: 0
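
One caveat on the grep loop above: the patterns are prefix matches, so
"det_tasks: 1.*" also matches det_tasks: 10 through 19, and "s_run: 1.*"
also matches s_run: 10 and up, which may slightly inflate some counts.
A rough sketch of a single-pass histogram that compares the parsed
integers instead is below.  detach_histogram() is a hypothetical helper,
and the field names are simply those printed in the trace.

```python
import re
from collections import Counter

FIELD_RE = re.compile(r"(\w+): (\d+)")


def detach_histogram(lines, s_run=1, zeros=0):
    """Count samples per det_tasks value, matching s_run and zeros exactly."""
    hist = Counter()
    for line in lines:
        # Only look at the payload after the event name, so the timestamp
        # and task columns of the trace line cannot be mistaken for fields.
        _, found, payload = line.partition("load_balance:")
        if not found:
            continue
        fields = {k: int(v) for k, v in FIELD_RE.findall(payload)}
        if not {"s_run", "det_tasks", "zeros"} <= fields.keys():
            continue
        if fields["s_run"] == s_run and fields["zeros"] == zeros:
            hist[fields["det_tasks"]] += 1
    return hist


if __name__ == "__main__":
    import sys

    # Usage (path is whatever holds the trace): python3 hist.py < trace
    for det_tasks, count in sorted(detach_histogram(sys.stdin).items()):
        print(det_tasks, count)
```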
