Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

From: Peter Zijlstra
Date: Fri Aug 27 2021 - 10:51:20 EST


On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > +/**
> > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > + * @dst_cpu: Destination CPU of the load balancing
> > + * @sds: Load-balancing data with statistics of the local group
> > + * @sgs: Load-balancing statistics of the candidate busiest group
> > + * @sg: The candidate busiet group
> > + *
> > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > + *
> > + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > + * has lower priority.
> > + */
> > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > + struct sg_lb_stats *sgs,
> > + struct sched_group *sg)
> > +{
> > +#ifdef CONFIG_SCHED_SMT
> > + bool local_is_smt, sg_is_smt;
> > + int sg_busy_cpus;
> > +
> > + local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > + sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > +
> > + sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > +
> > + if (!local_is_smt) {
> > + /*
> > + * If we are here, @dst_cpu is idle and does not have SMT
> > + * siblings. Pull tasks if candidate group has two or more
> > + * busy CPUs.
> > + */
> > + if (sg_is_smt && sg_busy_cpus >= 2)
> > + return true;
> > +
> > + /*
> > + * @dst_cpu does not have SMT siblings. @sg may have SMT
> > + * siblings and only one is busy. In such case, @dst_cpu
> > + * can help if it has higher priority and is idle.
> > + */
> > + return !sds->local_stat.group_util &&
>
> sds->local_stat.group_util can't be used to decide if a CPU or group
> of CPUs is idle. util_avg is usually not null when a CPU becomes idle
> and you can have to wait more than 300ms before it becomes Null
> At the opposite, the utilization of a CPU can be null but a task with
> null utilization has just woken up on it.
> Utilization is used to reflect the average work of the CPU or group of
> CPUs but not the current state

If you want immediate idle, sgs->nr_running == 0 or sgs->idle_cpus ==
sgs->group_weight come to mind.

> > + sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > + }
> > +
> > + /* @dst_cpu has SMT siblings. */
> > +
> > + if (sg_is_smt) {
> > + int local_busy_cpus = sds->local->group_weight -
> > + sds->local_stat.idle_cpus;
> > + int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> > +
> > + /* Local can always help to even the number busy CPUs. */
>
> default behavior of the load balance already tries to even the number
> of idle CPUs.

Right, but I suppose this is because we're trapped here and have to deal
with the SMT-SMT case too. Ricardo, can you clarify?

> > + if (busy_cpus_delta >= 2)
> > + return true;
> > +
> > + if (busy_cpus_delta == 1)
> > + return sched_asym_prefer(dst_cpu,
> > + sg->asym_prefer_cpu);
> > +
> > + return false;
> > + }
> > +
> > + /*
> > + * @sg does not have SMT siblings. Ensure that @sds::local does not end
> > + * up with more than one busy SMT sibling and only pull tasks if there
> > + * are not busy CPUs. As CPUs move in and out of idle state frequently,
> > + * also check the group utilization to smoother the decision.
> > + */
> > + if (!sds->local_stat.group_util)
>
> same comment as above about the meaning of group_util == 0
>
> > + return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > +
> > + return false;
> > +#else
> > + /* Always return false so that callers deal with non-SMT cases. */
> > + return false;
> > +#endif
> > +}
> > +
> > static inline bool
> > sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs,
> > struct sched_group *group)
> > {
> > + /* Only do SMT checks if either local or candidate have SMT siblings */
> > + if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
> > + (group->flags & SD_SHARE_CPUCAPACITY))
> > + return asym_smt_can_pull_tasks(env->dst_cpu, sds, sgs, group);
> > +
> > return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
> > }