Re: [PATCH 3/3] sched: Spare idle load balancing on nohz_full CPUs
From: Frederic Weisbecker
Date: Tue Jun 20 2017 - 16:26:44 EST
On Tue, Jun 20, 2017 at 01:42:27PM -0400, Rik van Riel wrote:
> On Mon, 2017-06-19 at 04:12 +0200, Frederic Weisbecker wrote:
> > Although idle load balancing obviously only concerns idle CPUs, it can
> > be a disturbance on a busy nohz_full CPU. Indeed a CPU can only get rid
> > of an idle load balancing duty once a tick fires while it runs a task
> > and this can take a while in a nohz_full CPU.
> > We could fix that and escape the idle load balancing duty from the very
> > idle exit path but that would bring unnecessary overhead. Let's just not
> > bother and leave that job to housekeeping CPUs (those outside nohz_full
> > range). The nohz_full CPUs simply don't want any disturbance.
> > Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Rik van Riel <riel@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d711093..cfca960 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8659,6 +8659,10 @@ void nohz_balance_enter_idle(int cpu)
> >  	if (!cpu_active(cpu))
> >  		return;
> > +	/* Spare idle load balancing on CPUs that don't want to be disturbed */
> > +	if (!is_housekeeping_cpu(cpu))
> > +		return;
> > +
> >  	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
> >  		return;
> I am not entirely convinced on this one.
> Doesn't the if (on_null_domain(cpu_rq(cpu))) test
> a few lines down take care of this already?
It shouldn't, since nohz_full= doesn't imply isolcpus= anymore.
Of course it does if the user manually adds them.
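
To illustrate the difference between the two checks, here is a minimal userspace sketch, not kernel code. The masks, the model_* helper names and the boot parameters (nohz_full=4-7 isolcpus=6-7) are assumptions made up for the example; only the order of the checks follows the patch above.

```c
#include <stdbool.h>

/* Hypothetical boot configuration: nohz_full=4-7 isolcpus=6-7 */
static const unsigned int nohz_full_mask = 0xF0; /* CPUs 4-7 */
static const unsigned int isolcpus_mask  = 0xC0; /* CPUs 6-7 */

/* Stand-in for is_housekeeping_cpu(): for now, housekeepers are ~nohz_full */
static bool model_is_housekeeping_cpu(int cpu)
{
	return (nohz_full_mask & (1u << cpu)) == 0;
}

/* Stand-in for on_null_domain(): only true for CPUs listed in isolcpus= */
static bool model_on_null_domain(int cpu)
{
	return (isolcpus_mask & (1u << cpu)) != 0;
}

/*
 * Returns true if the CPU would still register itself as an idle load
 * balancing candidate when it goes idle. with_patch selects whether the
 * housekeeping check added by the patch is applied.
 */
static bool model_enters_idle_balance(int cpu, bool with_patch)
{
	if (with_patch && !model_is_housekeeping_cpu(cpu))
		return false;

	if (model_on_null_domain(cpu))
		return false;

	return true;
}
```

With these assumed masks, CPUs 6-7 are already filtered out by the null domain check, but CPUs 4-5 (nohz_full only) are filtered out only when the housekeeping check is present.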
> Do we want nohz_full to always automatically
> imply that no idle balancing will happen, like
> on isolated CPUs?
You're making a good point in that I would prefer that nohz_full be
only about the tick and let some sort of separate isolation subsystem
deal with individual isolation features: nohz, workqueues, idle load
balancing.
That's why I rather used is_housekeeping_cpu() and not !tick_nohz_full_cpu()
because for now housekeepers are ~tick_nohz_full_mask but later it should be
cpu_possible_mask by default or some given set of CPUs defined by the future