Re: [PATCH v5 02/10] sched/rt: add rt_rq utilization tracking

From: Patrick Bellasi
Date: Wed May 30 2018 - 05:32:18 EST


On 29-May 15:29, Vincent Guittot wrote:
> Hi Patrick,
> >> +static inline bool rt_rq_has_blocked(struct rq *rq)
> >> +{
> >> + if (rq->avg_rt.util_avg)
> >
> > Should use READ_ONCE?
>
> I was expecting that there will be only one read by default but I can
> add READ_ONCE

I would say here it's required mainly for "documentation" purposes,
since we can use this function from non rq-locked paths, e.g.

   update_sg_lb_stats()
      update_nohz_stats()
         update_blocked_averages()
            rt_rq_has_blocked()

Thus, AFAIU, we should use READ_ONCE to "flag" that the value can
potentially be updated concurrently?
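
To make the point concrete, here is a minimal userspace sketch of the
annotated helper. It is not the patch itself: the READ_ONCE() macro below
is a simplified stand-in for the kernel's (it relies on the GCC/Clang
__typeof__ extension), and the two structs are stripped-down, hypothetical
mirrors of struct rq and struct sched_avg.

```c
#include <stdbool.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE(): the volatile access
 * makes the compiler emit a single load instead of caching or re-reading
 * the value, and it marks the read as racing with concurrent updates.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, reduced versions of the kernel structures. */
struct sched_avg {
	unsigned long util_avg;
};

struct rq {
	struct sched_avg avg_rt;
};

/* May be called without holding the rq lock, hence the annotated read. */
static inline bool rt_rq_has_blocked(struct rq *rq)
{
	if (READ_ONCE(rq->avg_rt.util_avg))
		return true;

	return false;
}
```

The macro does not add any ordering guarantees here; it only documents
that util_avg can change underneath a lockless caller.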

> >
> >> + return true;
> >> +
> >> + return false;
> >
> > What about just:
> >
> > return READ_ONCE(rq->avg_rt.util_avg);
> >
> > ?
>
> This function is renamed and extended with other tracking in the
> following patches, so we have to test several values in the function.
> That's also why there is the if test: additional if tests are
> going to be added

Right, makes sense.
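
For illustration only, the extended form could end up looking roughly
like the sketch below. The function name others_rqs_have_blocked() and
the avg_dl field are assumptions made for this example, not taken from
this patch, and READ_ONCE() is again a simplified userspace stand-in
that relies on the GCC/Clang __typeof__ extension.

```c
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, reduced versions of the kernel structures. */
struct sched_avg {
	unsigned long util_avg;
};

struct rq {
	struct sched_avg avg_rt;
	struct sched_avg avg_dl;	/* assumed: added by a later patch */
};

/*
 * One "if" per tracked signal: adding another signal only means adding
 * another test, which a single "return READ_ONCE(...)" would not allow.
 */
static inline bool others_rqs_have_blocked(struct rq *rq)
{
	if (READ_ONCE(rq->avg_rt.util_avg))
		return true;

	if (READ_ONCE(rq->avg_dl.util_avg))
		return true;

	return false;
}
```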

[...]

> >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> >> index ef3c4e6..b4148a9 100644
> >> --- a/kernel/sched/rt.c
> >> +++ b/kernel/sched/rt.c
> >> @@ -5,6 +5,8 @@
> >> */
> >> #include "sched.h"
> >>
> >> +#include "pelt.h"
> >> +
> >> int sched_rr_timeslice = RR_TIMESLICE;
> >> int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
> >>
> >> @@ -1572,6 +1574,9 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> >>
> >> rt_queue_push_tasks(rq);
> >>
> >> + update_rt_rq_load_avg(rq_clock_task(rq), rq,
> >> + rq->curr->sched_class == &rt_sched_class);
> >> +
> >> return p;
> >> }
> >>
> >> @@ -1579,6 +1584,8 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
> >> {
> >> update_curr_rt(rq);
> >>
> >> + update_rt_rq_load_avg(rq_clock_task(rq), rq, 1);
> >> +
> >> /*
> >> * The previous task needs to be made eligible for pushing
> >> * if it is still active
> >> @@ -2308,6 +2315,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
> >> struct sched_rt_entity *rt_se = &p->rt;
> >>
> >> update_curr_rt(rq);
> >> + update_rt_rq_load_avg(rq_clock_task(rq), rq, 1);
> >
> > Mmm... not entirely sure... can't we fold
> > update_rt_rq_load_avg() into update_curr_rt() ?
> >
> > Currently update_curr_rt() is used in:
> >    dequeue_task_rt
> >    pick_next_task_rt
> >    put_prev_task_rt
> >    task_tick_rt
> >
> > while we update_rt_rq_load_avg() only in:
> >    pick_next_task_rt
> >    put_prev_task_rt
> >    task_tick_rt
> > and
> >    update_blocked_averages
> >
> > Why don't we need to update at dequeue_task_rt() time?
>
> We are tracking the rt rq and not sched entities, so we want to know
> whether or not sched rt is the running sched class on the rq. Tracking
> dequeue_task_rt is useless

What about (push) migrations?

--
#include <best/regards.h>

Patrick Bellasi