Re: [PATCH v2 11/13] sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups

From: Morten Rasmussen
Date: Fri Jul 15 2016 - 04:38:21 EST


On Thu, Jul 14, 2016 at 09:39:23AM -0700, Sai Gurrappadi wrote:
> On 06/30/2016 12:49 AM, Morten Rasmussen wrote:
> > On Thu, Jun 23, 2016 at 02:20:48PM -0700, Sai Gurrappadi wrote:
> >> Hi Morten,
> >>
> >> On 06/22/2016 10:03 AM, Morten Rasmussen wrote:
> >>
> >> [...]
> >>
> >>>
> >>> +/*
> >>> + * group_smaller_cpu_capacity: Returns true if sched_group sg has smaller
> >>> + * per-cpu capacity than sched_group ref.
> >>> + */
> >>> +static inline bool
> >>> +group_smaller_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
> >>> +{
> >>> +	return sg->sgc->max_capacity * capacity_margin <
> >>> +						ref->sgc->max_capacity * 1024;
> >>> +}
> >>> +
> >>> static inline enum
> >>> group_type group_classify(struct sched_group *group,
> >>> struct sg_lb_stats *sgs)
> >>> @@ -6892,6 +6903,19 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> >>>  	if (sgs->avg_load <= busiest->avg_load)
> >>>  		return false;
> >>>
> >>> +	if (!(env->sd->flags & SD_ASYM_CPUCAPACITY))
> >>> +		goto asym_packing;
> >>> +
> >>> +	/* Candidate sg has no more than one task per cpu and has
> >>> +	 * higher per-cpu capacity. Migrating tasks to less capable
> >>> +	 * cpus may harm throughput. Maximize throughput,
> >>> +	 * power/energy consequences are not considered.
> >>> +	 */
> >>> +	if (sgs->sum_nr_running <= sgs->group_weight &&
> >>> +	    group_smaller_cpu_capacity(sds->local, sg))
> >>> +		return false;
> >>> +
> >>> +asym_packing:
> >>
> >> What about the case where IRQ/RT work reduces the capacity of some of
> >> these bigger CPUs? sgc->max_capacity might not necessarily capture
> >> that case.
> >
> > Right, we could possibly improve this by using min_capacity instead, but
> > we could end up allowing tasks to be pulled to lower capacity cpus just
> > because one big cpu has reduced capacity due to RT/IRQ pressure and
> > therefore has lowered the group's min_capacity.
> >
> > Ideally we should check all the capacities, but that complicates things
> > a lot.
> >
> > Would you prefer min_capacity instead, or attempts to consider all the
> > cpu capacities available in both groups?
> >
>
> min_capacity as a start works, I think, given that we are only trying to
> make the existing LB better, not necessarily optimizing for every case.
> Might have to revisit this anyway for thermals etc.

Agreed, I will make it min_capacity instead of max_capacity in v3.
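
For illustration only, a rough and untested sketch of how the helper might
look with that change. It is written as a standalone snippet, so the struct
definitions are minimal stand-ins for the kernel ones, the min_capacity field
in sgc is assumed to exist, and the capacity_margin value of 1280 (roughly
20%) is an assumption as well:

```c
#include <stdbool.h>

/* Minimal stand-ins for the kernel structures; only the fields used here. */
struct sched_group_capacity {
	unsigned long min_capacity;	/* assumed field: lowest per-cpu capacity in the group */
	unsigned long max_capacity;	/* highest per-cpu capacity in the group */
};

struct sched_group {
	struct sched_group_capacity *sgc;
};

/* Assumed margin: 1280/1024, i.e. roughly 20%. */
static const unsigned long capacity_margin = 1280;

/*
 * group_smaller_cpu_capacity: Returns true if sched_group sg has smaller
 * per-cpu capacity than sched_group ref, comparing min_capacity instead of
 * max_capacity so that RT/IRQ pressure on the bigger cpus is reflected.
 */
static inline bool
group_smaller_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
{
	return sg->sgc->min_capacity * capacity_margin <
	       ref->sgc->min_capacity * 1024;
}
```

With made-up capacities of 430 for a little group and 1024 for a big group
the helper returns true, so the pull is still refused. If RT/IRQ pressure
drops one big cpu to 500, it returns false and the pull is allowed, which is
the side effect of min_capacity mentioned earlier in the thread.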

Thanks,
Morten