Re: [rfc][patch] sched: remove smpnice

From: Peter Williams
Date: Tue Feb 14 2006 - 17:38:40 EST


Siddha, Suresh B wrote:
On Fri, Feb 10, 2006 at 05:27:06PM -0800, Peter Williams wrote:

"Siddha, Suresh B" <suresh.b.siddha@xxxxxxxxx> wrote:

My testing showed that 178.galgel in SPECfp2000 is down by

~10% when run with

nice -20 on a 4P(8-way with HT) system compared to a nice-0 run.

Is it normal to run enough -20 tasks to cause this problem to manifest?


On a 4P(8-way with HT), if you run a -20 task(a simple infinite loop)
it hops from one processor to another processor... you can observe it
using top.

How do you get top to display which CPU tasks are running on?


find_busiest_group() thinks there is an imbalance and ultimately the
idle cpu kicks active load balance on busy cpu, resulting in the hopping.

I'm still having trouble getting my head around this. A task shouldn't be moved unless there's at least one other task on its current CPU, it (the task to be moved) is cache cold and it (the task to be moved) is not the currently active task. In these circumstances, moving the task to an idle CPU should be a "good thing" unless the time taken for the move is longer than the time that will pass before the task becomes the running task on its current CPU.

For a nice==-20 task to be hopping from CPU to CPU there must be higher (or equal) priority tasks running on each of the CPUs that it gets booted off at the time it gets booted off.



b) On a lightly loaded system, this can result in HT

scheduler optimizations

being disabled in presence of low priority tasks... in this

case, they(low

priority ones) can end up running on the same package, even

in the presence

of other idle packages.. Though this is not as serious as

"a" above...

I think that this issue comes under the heading of "Result of better nice enforcement" which is the purpose of the patch :-). I wouldn't call this HT disablement or do I misunderstand the issue.

The only way that I can see load balancing subverting the HT scheduling mechanisms is if (say) there are 2 CPUs with 2 HT channels each and all of the high priority tasks end up sharing the 2 channels of one CPU while all of the low priority tasks share the 2 channels of the other one. This scenario is far more likely to happen without the smpnice patches than with them.


I agree with you.. But lets take a DP system with HT, now if there are
only two low priority tasks running, ideally we should be running them
on two different packages. With this patch, we may end up running on the
same logical processor.. leave alone running on the same package..
As these are low priority tasks, it might be ok.. But...

Yes, this is an issue but it's not restricted to HT systems (except for the same package part). The latest patch that I've posted addresses (part of) this problem by replacing SCHED_LOAD_SCALE with the average load per runnable task in the code at the end of find_busiest_group() which handles the case where imbalance is small. This should enable load balancing to occur even if all runnable tasks are low priority.

The issue of the two tasks ending up on the same package isn't (as far as I can see) caused by the smpnice load balancing changes.

Peter
--
Peter Williams pwil3058@xxxxxxxxxxxxxx

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/