Re: [PATCH] sched: Optimize pick_next_task for idle_sched_class too

From: Peter Zijlstra
Date: Thu Feb 23 2017 - 08:56:44 EST


On Thu, Feb 23, 2017 at 04:04:22PM +0530, Pavan Kondeti wrote:
> Hi Peter,
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 49ce1cb..51ca21e 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3321,15 +3321,14 @@ static inline void schedule_debug(struct task_struct *prev)
> >  static inline struct task_struct *
> >  pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> >  {
> > -	const struct sched_class *class = &fair_sched_class;
> > +	const struct sched_class *class;
> >  	struct task_struct *p;
> >
> >  	/*
> >  	 * Optimization: we know that if all tasks are in
> >  	 * the fair class we can call that function directly:
> >  	 */
> > -	if (likely(prev->sched_class == class &&
> > -		   rq->nr_running == rq->cfs.h_nr_running)) {
> > +	if (likely(rq->nr_running == rq->cfs.h_nr_running)) {
> >  		p = fair_sched_class.pick_next_task(rq, prev, rf);
> >  		if (unlikely(p == RETRY_TASK))
> >  			goto again;
>
> Would this delay pulling RT tasks from other CPUs? Let's say this CPU
> has 2 fair tasks and 1 RT task. The RT task is sleeping now. Earlier,
> we attempt to pull RT tasks from other CPUs in pick_next_task_rt(),
> which is not done anymore.

It should not; the two places of interest are when we leave the RT
class to run anything lower (fair, idle), at which point we'll pull,
or when an RT task wakes up, at which point it'll push.
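
To make the argument concrete, here is a toy user-space model of those two
events. It is only a sketch and not kernel code: struct toy_rq,
toy_rt_wakeup(), toy_rt_block() and toy_no_stranded_rt() are hypothetical
names invented for illustration, and the model ignores priorities,
affinity and locking. It only loosely mirrors what pull_rt_task() and
push_rt_task() in kernel/sched/rt.c are for. The point it tries to show is
that if every wakeup pushes and every exit from the RT class pulls, a CPU
that is already running fair tasks has nothing left to pull.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Hypothetical per-CPU state; this is NOT the kernel's struct rq. */
struct toy_rq {
	bool rt_running;	/* an RT task currently occupies this CPU */
	int rt_waiting;		/* runnable RT tasks queued behind it */
};

static struct toy_rq toy_rqs[TOY_NR_CPUS];

/*
 * Event 1: an RT task wakes up on @cpu.  If @cpu is already running an
 * RT task, try to push the woken task to a CPU that is not.
 * Returns the CPU the task starts running on, or -1 if it had to queue.
 */
static int toy_rt_wakeup(int cpu)
{
	struct toy_rq *rq = &toy_rqs[cpu];
	int i;

	if (!rq->rt_running) {
		rq->rt_running = true;
		return cpu;
	}

	for (i = 0; i < TOY_NR_CPUS; i++) {
		if (i != cpu && !toy_rqs[i].rt_running) {
			toy_rqs[i].rt_running = true;	/* push */
			return i;
		}
	}

	rq->rt_waiting++;	/* every CPU is busy with RT: wait here */
	return -1;
}

/*
 * Event 2: the RT task running on @cpu blocks.  Before dropping to a
 * lower class (fair, idle), take a waiting RT task, locally or by
 * pulling from another CPU.
 * Returns true if @cpu keeps running an RT task.
 */
static bool toy_rt_block(int cpu)
{
	struct toy_rq *rq = &toy_rqs[cpu];
	int i;

	assert(rq->rt_running);

	if (rq->rt_waiting > 0) {
		rq->rt_waiting--;
		return true;
	}

	for (i = 0; i < TOY_NR_CPUS; i++) {
		if (i != cpu && toy_rqs[i].rt_waiting > 0) {
			toy_rqs[i].rt_waiting--;	/* pull */
			return true;
		}
	}

	rq->rt_running = false;	/* nothing to pull: run fair or idle */
	return false;
}

/* True if no RT task is waiting while some CPU runs a lower class. */
static bool toy_no_stranded_rt(void)
{
	bool any_waiting = false, any_free = false;
	int i;

	for (i = 0; i < TOY_NR_CPUS; i++) {
		if (toy_rqs[i].rt_waiting > 0)
			any_waiting = true;
		if (!toy_rqs[i].rt_running)
			any_free = true;
	}
	return !(any_waiting && any_free);
}
```

In this model toy_no_stranded_rt() holds after every call to either event,
so a pick that goes straight to the fair class on a CPU with no runnable
RT task does not leave an RT task waiting elsewhere.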