Re: [PATCH] sched/fair: Prevent cpu_busy_time from exceeding actual_cpu_capacity

From: Qais Yousef
Date: Tue Jun 18 2024 - 11:26:58 EST


On 06/18/24 17:20, Vincent Guittot wrote:

> > > Sorry, I missed that fits_capacity() uses capacity_of(). Without
> > > uclamp_max, the rd is over-utilized and would not use feec().
> > > But I noticed that with uclamp_max, if the rq's uclamp_max is smaller
> > > than SCHED_CAPACITY_SCALE and bigger than actual_cpu_capacity,
> > > util_fits_cpu() returns true and the rd is not over-utilized.
> > > Is this behavior intentional?
> >
> > Hmm. To a great extent yes. We didn't want to take all types of rq pressure
> > into account for uclamp_max. But this corner case could be debatable.
>
> Shouldn't we use get_actual_cpu_capacity() instead of
> arch_scale_cpu_capacity() everywhere in util_fits_cpu()?
> get_actual_cpu_capacity() appeared recently and there was discussion
> about whether or not to use the thermal load_avg, but everything is
> fixed now and I think that using get_actual_cpu_capacity() everywhere in
> util_fits_cpu() would make sense and cover the case reported by Xuewen
> just above.

Yes, agreed. I think we need both patches, although we still need to confirm
that uclamp_max is what is causing the situation Xuewen is seeing. Otherwise we
have a race somewhere that needs to be understood.
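
To make the corner case concrete, here is a simplified user-space sketch of
the uclamp_max part of the check. It is not the kernel code: struct cpu_model,
model_util_fits() and the use_actual flag are hypothetical names for
illustration, the uclamp_min handling is left out, and it assumes that only
the uclamp_max comparison limit is switched from the original capacity to the
actual capacity.

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL

/* Hypothetical stand-in for per-CPU state; field names are illustrative. */
struct cpu_model {
	unsigned long capacity_orig;   /* models arch_scale_cpu_capacity() */
	unsigned long capacity_actual; /* models get_actual_cpu_capacity() */
	unsigned long capacity;        /* models capacity_of() */
};

/* Headroom rule in the style of fits_capacity(): util must stay below ~80% of cap. */
static bool fits_capacity(unsigned long util, unsigned long cap)
{
	return util * 1280 < cap * 1024;
}

/*
 * Simplified model of the uclamp_max side of the fit check.
 * use_actual == false: compare uclamp_max against the original capacity.
 * use_actual == true:  compare uclamp_max against the actual capacity
 *                      (an assumed reading of the suggestion above).
 */
static bool model_util_fits(unsigned long util, unsigned long uclamp_max,
			    const struct cpu_model *cpu, bool use_actual)
{
	unsigned long limit = use_actual ? cpu->capacity_actual
					 : cpu->capacity_orig;
	bool fits = fits_capacity(util, cpu->capacity);

	/* An unclamped task on a full-capacity CPU gets no uclamp_max exemption. */
	bool max_is_unclamped = cpu->capacity_orig == SCHED_CAPACITY_SCALE &&
				uclamp_max == SCHED_CAPACITY_SCALE;
	bool uclamp_max_fits = !max_is_unclamped && uclamp_max <= limit;

	return fits || uclamp_max_fits;
}
```

For example, with capacity_orig = 1024, capacity_actual = 600, capacity = 600,
util = 900 and uclamp_max = 800, the model reports a fit when comparing
against the original capacity (800 <= 1024) but no fit when comparing against
the actual capacity (800 > 600), which is the situation Xuewen describes.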