Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

From: Paul E. McKenney
Date: Tue Jul 11 2017 - 14:11:20 EST


On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> > > From: Aubrey Li <aubrey.li@xxxxxxxxxxxxxxx>
> > >
> > > The system will enter a fast idle loop if the predicted idle period
> > > is shorter than the threshold.
> > > ---
> > > kernel/sched/idle.c | 9 ++++++++-
> > > 1 file changed, 8 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > > index cf6c11f..16a766c 100644
> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
> > > */
> > > static void do_idle(void)
> > > {
> > > + unsigned int predicted_idle_us;
> > > + unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
> > > /*
> > > * If the arch has a polling bit, we maintain an invariant:
> > > *
> > > @@ -291,7 +293,12 @@ static void do_idle(void)
> > >
> > > __current_set_polling();
> > >
> > > - cpuidle_generic();
> > > + predicted_idle_us = cpuidle_predict();
> > > +
> > > + if (likely(predicted_idle_us < short_idle_threshold))
> > > + cpuidle_fast();
> >
> > What if we get here from nohz_full usermode execution? In that
> > case, if I remember correctly, the scheduling-clock interrupt
> > will still be disabled, and would have to be re-enabled before
> > we could safely invoke cpuidle_fast().
> >
> > Or am I missing something here?
>
> That's a good point. It's partially ok because if the tick is needed
> for something specific, it is not entirely stopped but programmed to that
> deadline.
>
> Now there is some idle specific code when we enter dynticks-idle. See
> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
> and some subsystems that react differently when we enter dyntick idle
> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
>
> For now I'd rather suggest that we treat full nohz as an exception case here
> and do:
>
> 	if (!tick_nohz_full_cpu(smp_processor_id()) &&
> 	    likely(predicted_idle_us < short_idle_threshold))
> 		cpuidle_fast();
>
> Ugly but safer!

Works for me!
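
For anyone reading this in the archive, here is a minimal userspace sketch
of the check Frederic suggests. It is a model only, not kernel code. The
names ending in _model, choose_idle_path(), and the enum are hypothetical
and exist just for illustration, and the fallback to a generic idle path is
my assumption about what the rest of the hunk does, since the quoted diff
does not show it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of the decision; these are not kernel symbols. */
enum idle_path { IDLE_PATH_FAST, IDLE_PATH_GENERIC };

#define MODEL_NR_CPUS 4

/* Assumption for the model: CPUs 2 and 3 were booted as nohz_full CPUs. */
static const bool nohz_full_model[MODEL_NR_CPUS] = {
	false, false, true, true
};

/* Stand-in for tick_nohz_full_cpu(); it only consults the table above. */
static bool tick_nohz_full_cpu_model(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return nohz_full_model[cpu];
}

/*
 * Take the fast idle path only when the CPU is not a nohz_full CPU and
 * the predicted idle period is strictly below the threshold. Everything
 * else goes to the generic path (assumed fallback).
 */
static enum idle_path choose_idle_path(int cpu,
				       unsigned int predicted_idle_us,
				       unsigned int short_idle_threshold_us)
{
	if (!tick_nohz_full_cpu_model(cpu) &&
	    predicted_idle_us < short_idle_threshold_us)
		return IDLE_PATH_FAST;

	return IDLE_PATH_GENERIC;
}
```

With HZ=1000, jiffies_to_usecs(1) / 2 works out to 500 us, so
choose_idle_path(0, 100, 500) picks the fast path, while
choose_idle_path(2, 100, 500) stays on the generic path because CPU 2 is
nohz_full in this model.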

Thanx, Paul