Re: [PATCH v5 03/10] cpufreq/schedutil: add rt utilization tracking
From: Juri Lelli
Date: Thu May 31 2018 - 04:46:19 EST
On 30/05/18 17:46, Quentin Perret wrote:
> Hi Vincent,
>
> On Friday 25 May 2018 at 15:12:24 (+0200), Vincent Guittot wrote:
> > Add both cfs and rt utilization when selecting an OPP for cfs tasks as rt
> > can preempt and steal cfs's running time.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 14 +++++++++++---
> > 1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 28592b6..a84b5a5 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -56,6 +56,7 @@ struct sugov_cpu {
> >  	/* The fields below are only needed when sharing a policy: */
> >  	unsigned long util_cfs;
> >  	unsigned long util_dl;
> > +	unsigned long util_rt;
> >  	unsigned long max;
> >
> >  	/* The field below is for single-CPU policies only: */
> > @@ -178,14 +179,21 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
> >  	sg_cpu->max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu);
> >  	sg_cpu->util_cfs = cpu_util_cfs(rq);
> >  	sg_cpu->util_dl = cpu_util_dl(rq);
> > +	sg_cpu->util_rt = cpu_util_rt(rq);
> >  }
> >
> >  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
> >  {
> >  	struct rq *rq = cpu_rq(sg_cpu->cpu);
> > +	unsigned long util;
> >
> > -	if (rq->rt.rt_nr_running)
> > -		return sg_cpu->max;
> > +	if (rq->rt.rt_nr_running) {
> > +		util = sg_cpu->max;
>
> So I understand why we want to go to max freq when a RT task is running,
> but I think there are use cases where we might want to be more conservative
> and use the util_avg of the RT rq instead. The first use case is
> battery-powered devices where going to max isn't really affordable from
> an energy standpoint. Android, for example, has been using a RT
> utilization signal to select OPPs for quite a while now, because going
> to max blindly is _very_ expensive.
>
> And the second use-case is thermal pressure. On some modern CPUs, going to
> max freq can lead to stringent thermal capping very quickly, to the
> point where your CPUs might not have enough capacity to serve your tasks
> properly. And that can ultimately hurt the very RT tasks you originally
> tried to run fast. In these systems, in the long term, you'd be better off
> not asking for more than what you really need ...

I proposed the same at the last LPC. Peter NAKed it (since RT is all about
meeting deadlines, and when using FIFO/RR we don't really know how fast
the CPU should go to meet them, so going to max is the only safe decision).

> So what about having a sched_feature to select between going to max and
> using the RT util_avg? Obviously the default should keep the current
> behaviour.

Peter, would SCHED_FEAT make a difference? :)
Or Patrick's utilization capping applied to RT..
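
To make the two options concrete, here is a minimal userspace sketch of what
a switchable policy could look like. It is not kernel code and not taken from
the patch: struct sugov_cpu_model, rt_use_util_avg and aggregate_util_model()
are hypothetical names, the boolean merely stands in for a sched_feature, and
summing the three signals then clamping to max is an assumption about the
part of the hunk that is not quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; field names mirror struct sugov_cpu in the patch. */
struct sugov_cpu_model {
	unsigned long util_cfs;
	unsigned long util_dl;
	unsigned long util_rt;
	unsigned long max;
	unsigned int rt_nr_running;	/* stands in for rq->rt.rt_nr_running */
};

/* Hypothetical knob standing in for a sched_feature; default is go-to-max. */
static bool rt_use_util_avg = false;

static unsigned long aggregate_util_model(const struct sugov_cpu_model *c)
{
	unsigned long util;

	assert(c != NULL);

	/* Current behaviour: any runnable RT task requests full capacity. */
	if (c->rt_nr_running && !rt_use_util_avg)
		return c->max;

	/* Assumption: sum the per-class signals and clamp to CPU capacity. */
	util = c->util_cfs + c->util_dl + c->util_rt;
	return util < c->max ? util : c->max;
}
```

With util_cfs = 100, util_dl = 50, util_rt = 200, max = 1024 and one runnable
RT task, this model returns 1024 with the knob off and 350 with it on.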