Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking

From: Boqun Feng
Date: Fri Jun 19 2015 - 03:57:33 EST

Next message: Paolo Bonzini: "Re: [PATCH 3/5] vhost: support upto 509 memory regions"
Previous message: Michael S. Tsirkin: "Re: [PATCH 3/5] vhost: support upto 509 memory regions"
In reply to: Yuyang Du: "Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking"
Next in thread: Yuyang Du: "Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking"

Hi Yuyang,

On Fri, Jun 19, 2015 at 07:05:54AM +0800, Yuyang Du wrote:
> On Fri, Jun 19, 2015 at 02:00:38PM +0800, Boqun Feng wrote:
> > However, update_cfs_rq_load_avg() only updates cfs_rq->avg, the change
> > won't be contributed or aggregated to cfs_rq's parent in the
> > for_each_leaf_cfs_rq loop, therefore that's actually not a bottom-up
> > update.
> >
> > To fix this, I think we can add an update_cfs_shares(cfs_rq) call after
> > update_cfs_rq_load_avg(). Like:
> >
> >  	for_each_leaf_cfs_rq(rq, cfs_rq) {
> > -		/*
> > -		 * Note: We may want to consider periodically releasing
> > -		 * rq->lock about these updates so that creating many task
> > -		 * groups does not result in continually extending hold time.
> > -		 */
> > -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> > +		/* throttled entities do not contribute to load */
> > +		if (throttled_hierarchy(cfs_rq))
> > +			continue;
> > +
> > +		update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
> > +		update_cfs_shares(cfs_rq);
> >  	}
> >
> > However, I think update_cfs_shares() isn't cheap, because it may do a
> > bottom-up update once called. So how about just updating the root cfs_rq?
> > Like:
> >
> > -	/*
> > -	 * Iterates the task_group tree in a bottom up fashion, see
> > -	 * list_add_leaf_cfs_rq() for details.
> > -	 */
> > -	for_each_leaf_cfs_rq(rq, cfs_rq) {
> > -		/*
> > -		 * Note: We may want to consider periodically releasing
> > -		 * rq->lock about these updates so that creating many task
> > -		 * groups does not result in continually extending hold time.
> > -		 */
> > -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> > -	}
> > +	update_cfs_rq_load_avg(rq_clock_task(rq), &rq->cfs);
>
> Hi Boqun,
>
> Did I get you right:
>
> This rewrite patch does not NEED to aggregate entity's load to cfs_rq,
> but rather directly update the cfs_rq's load (both runnable and blocked),
> so there is NO NEED to iterate all of the cfs_rqs.

Actually, I'm not sure whether we NEED to aggregate or NOT.

>
> So simply updating the top cfs_rq is already equivalent to the stock.
>

The stock code does have a bottom-up update, so simply updating the top
cfs_rq is not equivalent to it. Simply updating the top cfs_rq is
equivalent to the rewrite patch, because the rewrite patch lacks the
aggregation.
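
To make that equivalence concrete, here is a toy C model. It is only a
sketch under stated assumptions, not kernel code: struct toy_cfs_rq,
toy_update_load_avg(), toy_update_all() and toy_update_root_only() are
hypothetical names, and the "halve once per period" decay merely stands in
for whatever update_cfs_rq_load_avg() really computes. The only property it
is meant to show is that an update which touches nothing but the queue it is
given produces the same root value whether or not the other queues are
visited.

```c
/* Toy stand-in for a cfs_rq: one load value plus its last update time. */
struct toy_cfs_rq {
    unsigned long load_avg;
    unsigned long last_update;
};

/*
 * Hypothetical stand-in for update_cfs_rq_load_avg(): halve the load once
 * per elapsed period. It reads and writes only the queue it is given.
 */
void toy_update_load_avg(unsigned long now, struct toy_cfs_rq *q)
{
    unsigned long periods = now - q->last_update;

    /* Cap the shift so it stays well-defined for long gaps. */
    q->load_avg = periods < 32 ? q->load_avg >> periods : 0;
    q->last_update = now;
}

/*
 * Variant A: visit every queue, leaf first, with no aggregation step.
 * qs[0] is the leaf and qs[n - 1] is the root.
 */
void toy_update_all(unsigned long now, struct toy_cfs_rq *qs, int n)
{
    for (int i = 0; i < n; i++)
        toy_update_load_avg(now, &qs[i]);
}

/* Variant B: update only the root and leave the other queues untouched. */
void toy_update_root_only(unsigned long now, struct toy_cfs_rq *qs, int n)
{
    toy_update_load_avg(now, &qs[n - 1]);
}
```

Starting from loads of 1024 (leaf), 2048 and 4096 (root), all last updated
at time 0, and calling either variant with now = 2, the root ends at 1024 in
both cases. The leaf ends at 256 under toy_update_all() but stays at 1024
under toy_update_root_only(), so in this model the two variants differ only
in how fresh the non-root queues are.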

> It is better if we iterate the cfs_rqs to update the actual weight
> (update_cfs_shares), because the weight may have already changed, which
> would in turn change the load. But update_cfs_shares is not cheap.
>
> Right?

You got me right for the most part ;-)

My points are:

1. We *may not* need to aggregate entity's load to cfs_rq in
update_blocked_averages(), simply updating the top cfs_rq may be just
fine, but I'm not sure, so scheduler experts' insights are needed here.

2. Whether we need to aggregate or not, the update_blocked_averages() in
the rewrite patch could be improved. If we need to aggregate, we have to
add something like update_cfs_shares(). If we don't need to, we can just
replace the loop with one update_cfs_rq_load_avg() on the root cfs_rq.
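
For the "need to aggregate" case, the following toy C model sketches why the
leaf-first order matters once an aggregation step exists. Everything in it
is an assumption made for illustration: struct toy_group_rq, toy_total(),
toy_propagate(), toy_pass() and toy_setup() are hypothetical, and
toy_propagate() only stands in for whatever an update_cfs_shares()-like step
would do. It does not reproduce the kernel's arithmetic.

```c
#include <stddef.h>

/* Toy stand-in for a cfs_rq inside a task-group hierarchy. */
struct toy_group_rq {
    long own_load;      /* load of entities queued directly here */
    long child_load;    /* sum of what the children last reported */
    long reported;      /* what this queue last reported to its parent */
    struct toy_group_rq *parent;
};

long toy_total(const struct toy_group_rq *q)
{
    return q->own_load + q->child_load;
}

/* Hypothetical aggregation step: fold this queue's change into its parent. */
void toy_propagate(struct toy_group_rq *q)
{
    long delta = toy_total(q) - q->reported;

    if (q->parent)
        q->parent->child_load += delta;
    q->reported = toy_total(q);
}

/* One pass over the queues in the order given by the caller. */
void toy_pass(struct toy_group_rq **order, size_t n)
{
    for (size_t i = 0; i < n; i++)
        toy_propagate(order[i]);
}

/*
 * Build leaf -> mid -> root with 1024 units of load on the leaf, already
 * reported upwards, then let the leaf's load drop to 512 without telling
 * anyone yet.
 */
void toy_setup(struct toy_group_rq *root, struct toy_group_rq *mid,
               struct toy_group_rq *leaf)
{
    *root = (struct toy_group_rq){ 0, 1024, 1024, NULL };
    *mid = (struct toy_group_rq){ 0, 1024, 1024, root };
    *leaf = (struct toy_group_rq){ 1024, 0, 1024, mid };
    leaf->own_load = 512;
}
```

After toy_setup(), one toy_pass() in leaf, mid, root order brings the root's
total down to 512. One pass in root, mid, leaf order leaves the root at 1024
and only updates mid, so the root would need further passes to catch up. In
this model that is the whole difference between a bottom-up update and a
plain walk over the queues, and it also shows where the extra cost of the
aggregation step comes from: every queue on the path to the root is written.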

I think we'd better figure out the "may not" part in point 1 first to
get a reasonable implementation of update_blocked_averages().

Is that clear now?

Thanks and Best Regards,
Boqun

Attachment: signature.asc
Description: PGP signature
