Re: [PATCH] Avoid moving tasks when a schedule can be made.

From: Peter Williams
Date: Wed Feb 01 2006 - 20:24:25 EST


Steven Rostedt wrote:
On Wed, 2006-02-01 at 14:36 +1100, Peter Williams wrote:

Steven Rostedt wrote:

I found this in the -rt kernel. While running "hackbench 20" I hit
latencies of over 1.5 ms. That is huge! This latency was created by
the move_tasks function in sched.c to rebalance the queues over CPUS.

There currently isn't any check in this function to see if it should
stop, thus a large number of tasks can drive the latency high.

With the below patch, (tested on -rt with latency tracing), the latency
caused by hackbench disappeared below my notice threshold (100 usecs).

I'm not convinced that this bail out is in the right location, but it
worked where it is. Comments are welcome.



I presume that the intention here is to allow a newly woken task that preempts the current task to stop the load balancing?

As I see it (and I may be wrong), for this to happen, the task must have woken before the run queue locks were taken (otherwise it wouldn't have got as far as activation) i.e. before move_tasks() is called and therefore you may as well just do this check at the start of move_tasks().


Actually, one of the tasks that was moved might need to resched right
away, since it preempts the current task that is doing the moving.

Good point but I'd say that this was an instance when you didn't want to bail out of the load balance. E.g. during idle balancing the very first task moved would trigger it regardless of its priority. Also, if the task was of sufficiently high priority for it to warrant bailing out of the load balance why wasn't the current task (i.e. why didn't it preempt on its current CPU).



However, a newly woken task that preempts the current task isn't the only way that needs_resched() can become true just before load balancing is started. E.g. scheduler_tick() calls set_tsk_need_resched(p) when a task finishes a time slice and this patch would cause rebalance_tick() to be aborted after a lot of work has been done in this case.


No real work is lost. This is a loop that individually pulls tasks. So
the bail only stops the work of looking for more tasks to pull and we
don't lose the tasks that have already been pulled.

I disagree. A bunch of work is done to determine which CPU to pull from and how many tasks to pull and then it will bail out before any of them are moved (and for no good reason).



In summary, I think that the bail out is badly placed and needs some way of knowing if the reason need_resched() has become true is because of preemption of a newly woken task and not some other reason.


I need that bail in the loop, so it can stop if needed. Like I said, it
can be a task that is pulled to cause the bail. Also, having the run
queue locks and interrupts off for over a msec is really a bad idea.

Clearly, the way to handle this is to impose some limit on the number of tasks to be moved or split large moves into a number of smaller moves (releasing and reacquiring locks in between). This could be done in the bits of code that set things up before move_tasks() is called.



Peter
PS I've added Nick Piggin to the CC list as he is interested in load balancing issues.


Thanks, and thanks for the comments too. I'm up for all suggestions and
ideas. I just feel it is important that we don't have unbounded
latencies of spin locks and interrupts off.

Well, you seem to have succeeded in starting a discussion :-)

Peter
--
Peter Williams pwil3058@xxxxxxxxxxxxxx

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/