Re: [PATCH] sched/fair: Introduce fits_capacity()

From: Quentin Perret
Date: Thu Jun 06 2019 - 04:16:33 EST


On Thursday 06 Jun 2019 at 08:22:04 (+0530), Viresh Kumar wrote:
> On 05-06-19, 10:16, Quentin Perret wrote:
> > Hi Viresh,
> >
> > On Tuesday 04 Jun 2019 at 12:31:52 (+0530), Viresh Kumar wrote:
> > > The same formula to check utilization against capacity (after
> > > considering capacity_margin) is already used at 5 different locations.
> > >
> > > This patch creates a new macro, fits_capacity(), which can be used from
> > > all these locations without exposing the details of it and hence
> > > simplify code.
> > >
> > > All the 5 code locations are updated as well to use it..
> > >
> > > Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> > > ---
> > > kernel/sched/fair.c | 14 +++++++-------
> > > 1 file changed, 7 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 7f8d477f90fe..db3a218b7928 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -102,6 +102,8 @@ int __weak arch_asym_cpu_priority(int cpu)
> > > * (default: ~20%)
> > > */
> > > static unsigned int capacity_margin = 1280;
> > > +
> > > +#define fits_capacity(cap, max) ((cap) * capacity_margin < (max) * 1024)
> > > #endif
> > >
> > > #ifdef CONFIG_CFS_BANDWIDTH
> > > @@ -3727,7 +3729,7 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
> > >
> > > static inline int task_fits_capacity(struct task_struct *p, long capacity)
> > > {
> > > - return capacity * 1024 > task_util_est(p) * capacity_margin;
> > > + return fits_capacity(task_util_est(p), capacity);
> > > }
> > >
> > > static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > > @@ -5143,7 +5145,7 @@ static inline unsigned long cpu_util(int cpu);
> > >
> > > static inline bool cpu_overutilized(int cpu)
> > > {
> > > - return (capacity_of(cpu) * 1024) < (cpu_util(cpu) * capacity_margin);
> > > + return !fits_capacity(cpu_util(cpu), capacity_of(cpu));
> >
> > This ...
> >
> > > }
> > >
> > > static inline void update_overutilized_status(struct rq *rq)
> > > @@ -6304,7 +6306,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> > > /* Skip CPUs that will be overutilized. */
> > > util = cpu_util_next(cpu, p, cpu);
> > > cpu_cap = capacity_of(cpu);
> > > - if (cpu_cap * 1024 < util * capacity_margin)
> > > + if (!fits_capacity(util, cpu_cap))
> >
> > ... and this isn't _strictly_ equivalent to the existing code but I
> > guess we can live with the difference :-)
>
> Yes, I missed the == part it seems. Good catch. Though as you said,
> maybe we don't need to take that into account and can live with the
> new macro :)
>
> >
> > > continue;
> > >
> > > /* Always use prev_cpu as a candidate. */
> > > @@ -7853,8 +7855,7 @@ group_is_overloaded(struct lb_env *env, struct sg_lb_stats *sgs)
> > > static inline bool
> > > group_smaller_min_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
> > > {
> > > - return sg->sgc->min_capacity * capacity_margin <
> > > - ref->sgc->min_capacity * 1024;
> > > + return fits_capacity(sg->sgc->min_capacity, ref->sgc->min_capacity);
> > > }
> > >
> > > /*
> > > @@ -7864,8 +7865,7 @@ group_smaller_min_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
> > > static inline bool
> > > group_smaller_max_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
> > > {
> > > - return sg->sgc->max_capacity * capacity_margin <
> > > - ref->sgc->max_capacity * 1024;
> > > + return fits_capacity(sg->sgc->max_capacity, ref->sgc->max_capacity);
> > > }
> > >
> > > static inline enum
> > > --
> > > 2.21.0.rc0.269.g1a574e7a288b
> > >
> >
> > Also, since we're talking about making the capacity_margin code more
> > consistent, one small thing I had in mind: we have a capacity margin
> > in sugov too, which happens to be 1.25 has well (see map_util_freq()).
> > Conceptually, capacity_margin in fair.c and the sugov margin are both
> > about answering: "do I have enough CPU capacity to serve X of util, or
> > do I need more ?"
> >
> > So perhaps we should factorize the capacity_margin code some more to use
> > it in both places in a consistent way ? This could be done in a separate
> > patch, though.
>
> Hmm, even if the values are same currently I am not sure if we want
> the same for ever.

Right, that's a good point. It is cheaper to raise the freq on a CPU than
to migrate a task, so perhaps there could be a case for different
thresholds ...

> I will write a patch for it though, if Peter/Rafael
> feel the same as you.

Sounds good, thanks !
Quentin