Re: [RFC] sched: Limit idle_balance() when it is being used too frequently

From: Rik van Riel
Date: Thu Jul 18 2013 - 08:00:55 EST


On 07/18/2013 05:32 AM, Peter Zijlstra wrote:
> On Wed, Jul 17, 2013 at 09:02:24PM -0700, Jason Low wrote:
>
>> I ran a few AIM7 workloads for the 8 socket HT enabled case and I needed
>> to set N to more than 20 in order to get the big performance gains.
>>
>> One thing that I thought of was to have N be based on how often idle
>> balance attempts do not pull task(s).
>>
>> For example, N can be calculated based on the number of idle balance
>> attempts for the CPU since the last "successful" idle balance attempt.
>> So if the previous 30 idle balance attempts resulted in no tasks moved,
>> then N = 30 / 5. So idle balance gets less time to run as the number of
>> unneeded idle balance attempts increases, and thus N will not be set too
>> high during situations where idle balancing is "successful" more often.
>> Any comments on this idea?
>
> It would be good to get a solid explanation for why we need such high N.
> But yes that might work.

I have some idea, though no proof :)

I suspect a lot of the idle balancing time is spent waiting for
and acquiring the runqueue locks of remote CPUs.

If we spend half our idle time causing contention on remote
runqueue locks, we could be a big factor in keeping those other
CPUs from getting work done.
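
For reference, the N heuristic quoted above could be sketched roughly as
follows. This is only an illustrative userspace sketch, not kernel code:
the struct, the function names and the FAILS_PER_STEP constant are all
hypothetical, and the divisor of 5 is simply taken from the "30 / 5"
example in the proposal. How N would then be used to limit idle_balance()
is left out, since that depends on the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-CPU bookkeeping for the proposed heuristic.
 * None of these names exist in the kernel; they are for illustration only.
 */
struct idle_balance_stats {
	unsigned int failed_attempts;	/* attempts since the last successful pull */
};

/* Divisor taken from the "30 / 5" example; the right value is an open question. */
#define FAILS_PER_STEP 5

/* Record the outcome of one idle balance attempt on this CPU. */
static void record_attempt(struct idle_balance_stats *s, int pulled_tasks)
{
	assert(s != NULL);
	if (pulled_tasks > 0)
		s->failed_attempts = 0;		/* a "successful" attempt resets the count */
	else
		s->failed_attempts++;		/* nothing moved: one more unneeded attempt */
}

/* N grows with the number of consecutive attempts that moved no tasks. */
static unsigned int compute_n(const struct idle_balance_stats *s)
{
	assert(s != NULL);
	return s->failed_attempts / FAILS_PER_STEP;
}
```

With this bookkeeping, 30 consecutive attempts that move no tasks give
N = 6, and a single attempt that pulls a task drops N back to 0, so N
stays low on CPUs where idle balancing is "successful" more often.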

--
All rights reversed