Re: sched: Improve load balancing in the presence of idle CPUs

From: Morten Rasmussen
Date: Wed Apr 01 2015 - 13:03:42 EST


On Wed, Apr 01, 2015 at 07:49:56AM +0100, Preeti U Murthy wrote:
>
> On 04/01/2015 12:24 AM, Jason Low wrote:
> > On Tue, 2015-03-31 at 14:07 +0530, Preeti U Murthy wrote:
> >> Hi Jason,
> >>
> >> On 03/31/2015 12:25 AM, Jason Low wrote:
> >>> Hi Preeti,
> >>>
> >>> I noticed that another commit 4a725627f21d converted the check in
> >>> nohz_kick_needed() from idle_cpu() to rq->idle_balance, causing a
> >>> potentially outdated value to be used if this cpu is able to pull tasks
> >>> using rebalance_domains(), and nohz_kick_needed() directly returning
> >>> false.
> >>
> >> I see that rebalance_domains() will be run at the end of the scheduler
> >> tick interrupt handling. trigger_load_balance() only sets the softirq,
> >> it does not call rebalance_domains() immediately. So the call graph
> >> would be:
> >
> > Oh right, since that only sets the softirq, this wouldn't be the issue,
> > though we would need these changes if we were to incorporate any sort of
> > nohz_kick_needed() logic into the nohz_idle_balance() code path correct?
>
> I am sorry I don't quite get this. Can you please elaborate?

I think the scenario is that we are in nohz_idle_balance() and decide to
bail out because we have pulled some tasks, but before leaving
nohz_idle_balance() we want to check if more balancing is necessary
using nohz_kick_needed() and potentially kick somebody to continue.

Note that the balance cpu is currently skipped in nohz_idle_balance(),
but if it weren't, the scenario would be possible.

In that case, we can't rely on rq->idle_balance, since it seems to be
updated only at each tick and so would not be up-to-date after pulling
tasks. Also, we may even want to use nohz_kick_needed(rq) where rq !=
this_rq, in which case we probably also want an updated status.
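
To illustrate the staleness, here is a toy user-space model, not kernel
code. struct toy_rq and the toy_*() helpers are made up for this sketch,
and it assumes the flag is nothing more than a per-tick snapshot of
whether the cpu had no runnable tasks:

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct rq; only the two fields needed here. */
struct toy_rq {
	unsigned int nr_running;	/* runnable tasks on this cpu */
	bool idle_balance;		/* idleness snapshot from the last tick */
};

/* Assumed tick behaviour: record whether the cpu was idle at tick time. */
static void toy_tick(struct toy_rq *rq)
{
	rq->idle_balance = (rq->nr_running == 0);
}

/* Live check in the spirit of idle_cpu(): looks at the current state. */
static bool toy_idle_now(const struct toy_rq *rq)
{
	return rq->nr_running == 0;
}

/* Models pulling n tasks onto this cpu between two ticks. */
static void toy_pull_tasks(struct toy_rq *rq, unsigned int n)
{
	rq->nr_running += n;
}
```

If an idle toy_rq takes a tick and then pulls tasks, rq->idle_balance
still says "idle" until the next toy_tick(), while toy_idle_now()
already says "busy". That gap is what a check run before the next tick
would hit.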

Or maybe I'm all wrong :)