Re: [PATCH 1/5] sched/fair: Filter false overloaded_group case for EAS
From: Vincent Guittot
Date: Fri Sep 06 2024 - 02:51:42 EST
On Mon, 2 Sept 2024 at 11:01, Hongyan Xia <hongyan.xia2@xxxxxxx> wrote:
>
> On 30/08/2024 14:03, Vincent Guittot wrote:
> > With EAS, a group should be set overloaded if at least 1 CPU in the group
> > is overutilized, but it can happen that a CPU is fully utilized by tasks
> > because its compute capacity has been clamped. In such a case, the CPU is
> > not overutilized and, as a result, should not be set overloaded either.
> >
> > Since group_overloaded has a higher priority than group_misfit, such a
> > group can be selected as the busiest group instead of a group with a misfit
> > task, preventing load_balance from selecting the CPU with the misfit task
> > to pull the latter onto a fitting CPU.
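(For context: update_sd_pick_busiest() prefers the group with the highest
group_type, and group_overloaded ranks above group_misfit_task. Roughly,
from kernel/sched/fair.c, modulo drift against this series' base:

    enum group_type {
            group_has_spare = 0,
            group_fully_busy,
            group_misfit_task,
            group_smt_balance,
            group_asym_packing,
            group_imbalanced,
            group_overloaded,
    };

so an overloaded group always wins over one carrying a misfit task.)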
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 12 +++++++++++-
> > 1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index fea057b311f6..e67d6029b269 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9806,6 +9806,7 @@ struct sg_lb_stats {
> > enum group_type group_type;
> > unsigned int group_asym_packing; /* Tasks should be moved to preferred CPU */
> > unsigned int group_smt_balance; /* Task on busy SMT be moved */
> > + unsigned long group_overutilized; /* No CPU is overutilized in the group */
>
> Does this have to be unsigned long? I think a shorter width like bool
> (or int to be consistent with other fields) expresses the intention.
Yes, an unsigned int is enough.
>
> Also, the comment is a bit confusing to me. All the other fields are
> described positively, but this one's comment is phrased negatively.
That comes from the first way I implemented it; I then forgot to update
the comment. It should be:
/* At least one CPU is overutilized in the group */
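i.e. with the type change above folded in, the field would read something
like this in v2:

    unsigned int group_overutilized; /* At least one CPU is overutilized in the group */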
>
> > unsigned long group_misfit_task_load; /* A CPU has a task too big for its capacity */
> > #ifdef CONFIG_NUMA_BALANCING
> > unsigned int nr_numa_running;
> > @@ -10039,6 +10040,13 @@ group_has_capacity(unsigned int imbalance_pct, struct sg_lb_stats *sgs)
> > static inline bool
> > group_is_overloaded(unsigned int imbalance_pct, struct sg_lb_stats *sgs)
> > {
> > + /*
> > + * With EAS and uclamp, at least 1 CPU in the group must be
> > + * overutilized to consider the group overloaded.
> > + */
> > + if (sched_energy_enabled() && !sgs->group_overutilized)
> > + return false;
> > +
> > if (sgs->sum_nr_running <= sgs->group_weight)
> > return false;
> >
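To spell out why the uclamp case matters here: cpu_overutilized() goes
through util_fits_cpu(), which takes the rq's uclamp constraints into
account, so a CPU whose tasks are capped by uclamp_max can be fully busy
without ever being flagged overutilized. Roughly the existing helper in
kernel/sched/fair.c (modulo version drift):

    static inline int cpu_overutilized(int cpu)
    {
            unsigned long rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN);
            unsigned long rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX);

            /* Return true only if the utilization doesn't fit the CPU's capacity */
            return !util_fits_cpu(cpu_util_cfs(cpu), rq_util_min, rq_util_max, cpu);
    }

Without the check above, such a CPU's group could still be classified
group_overloaded purely on sum_nr_running.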
> > @@ -10252,8 +10260,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> > if (nr_running > 1)
> > *sg_overloaded = 1;
> >
> > - if (cpu_overutilized(i))
> > + if (cpu_overutilized(i)) {
> > *sg_overutilized = 1;
> > + sgs->group_overutilized = 1;
> > + }
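Note that *sg_overutilized keeps feeding the root-domain overutilized
status exactly as before; sgs->group_overutilized is the new per-group
flag that group_is_overloaded() consults above.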
> >
> > #ifdef CONFIG_NUMA_BALANCING
> > sgs->nr_numa_running += rq->nr_numa_running;