Re: [PATCH v3 09/22] sched: compute runnable load avg in cpu_load and cpu_avg_load_per_task

From: Alex Shi
Date: Tue Jan 08 2013 - 09:27:03 EST


On 01/07/2013 02:31 AM, Linus Torvalds wrote:
> On Sat, Jan 5, 2013 at 11:54 PM, Alex Shi <alex.shi@xxxxxxxxx> wrote:
>>
>> I just looked into the aim9 benchmark. In this case it forks 2000 tasks;
>> after all tasks are ready, aim9 gives a signal, then all tasks wake up in
>> a burst and run until they have all finished.
>> Since each task finishes very quickly, an imbalanced, empty cpu may go to
>> sleep until a regular balancing gives it some new tasks. That causes the
>> performance drop and more idle entries.
>
> Sounds like for AIM (and possibly for other really bursty loads), we
> might want to do some load-balancing at wakeup time by *just* looking
> at the number of running tasks, rather than at the load average. Hmm?

Many thanks for your suggestions! :)

It's worth trying the instant load -- nr_running -- in wake balancing; I
will try this. But in this case, I printed the sleeping tasks via
print_task() in sched/debug.c, and found that the 2000 tasks were forked
on just 2 LCPUs, which are in different cpu sockets, both with and without
this load avg patch.

So I am wondering whether it's worth considering the sleeping tasks' load
in fork/wake balancing. Has anyone considered this before?

===
 print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 {
 	if (rq->curr == p)
 		SEQ_printf(m, "R");
+	else if (!p->on_rq)
+		SEQ_printf(m, "S");
 	else
 		SEQ_printf(m, " ");
...
@@ -166,13 +170,14 @@ static void print_rq(struct seq_file *m, struct rq *rq, int rq_cpu)
 	read_lock_irqsave(&tasklist_lock, flags);

 	do_each_thread(g, p) {
-		if (!p->on_rq || task_cpu(p) != rq_cpu)
+		if (task_cpu(p) != rq_cpu)
 			continue;
===

>
> The load average is fundamentally always going to run behind a bit,
> and while you want to use it for long-term balancing, in the short term you
> might want to do just a "if we have a huge amount of runnable
> processes, do a load balancing *now*". Where "huge amount" should
> probably be relative to the long-term load balancing (ie comparing the
> number of runnable processes on this CPU right *now* with the load
> average over the last second or so would show a clear spike, and a
> reason for quick action).

Many thanks for the suggestion!
Will try it. :)
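
For reference, a minimal userspace sketch of the spike check described
above, as I read it: compare the number of runnable tasks right now with a
recent average and flag a clear jump. The struct, the fixed-point scale of
1024 and both thresholds are assumptions made up for illustration; none of
these names exist in the scheduler, and real code would read the runqueue
instead.

```c
#include <stdbool.h>

/* Hypothetical per-CPU snapshot; these are not kernel fields. */
struct cpu_snapshot {
	unsigned int nr_running;	/* tasks runnable right now */
	unsigned int avg_running_x1024;	/* recent average of nr_running, scaled by 1024 */
};

/* Assumed thresholds: "huge" means at least 4x the recent average. */
#define SPIKE_FACTOR	4
#define SPIKE_MIN_TASKS	2

/* Return true when the instant load clearly exceeds the recent average. */
static bool runnable_spike(const struct cpu_snapshot *s)
{
	unsigned long now = (unsigned long)s->nr_running * 1024;
	unsigned long avg = s->avg_running_x1024;

	/* A lone runnable task never justifies an immediate balance. */
	if (s->nr_running < SPIKE_MIN_TASKS)
		return false;

	return now >= SPIKE_FACTOR * avg;
}
```

With these assumed thresholds, a cpu that averaged 10 runnable tasks and
suddenly has 2000 is flagged, while a cpu sitting steadily at 8 is not.
Making the threshold relative to the average, rather than a fixed count,
follows the "relative to the long-term load" part of the suggestion.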
>
> Linus
>