Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig

From: Peter Zijlstra
Date: Tue Sep 08 2015 - 08:52:24 EST


On Tue, Sep 08, 2015 at 02:26:06PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 08, 2015 at 09:22:05AM +0200, Vincent Guittot wrote:
> > No, but
> > sa->util_avg = (sa->util_sum << SCHED_CAPACITY_SHIFT) / LOAD_AVG_MAX;
> > will fix the unit issue.
>
> Tricky that, LOAD_AVG_MAX very much relies on the unit being 1<<10.
>
> And where load_sum already gets a factor 1024 from the weight
> multiplication, util_sum does not get such a factor, and all the scaling
> we do on it loses bits.
>
> So at the moment we go compute the util_avg value, we need to inflate
> util_sum with an extra factor 1024 in order to make it work.
>
> And seeing that we do the shift up on sa->util_sum without consideration
> of overflow, would it not make sense to add that factor before the
> scaling and into the addition?
>
> Now, given all that, units are a complete mess here, and I'd not mind
> something like:
>
> #if (SCHED_LOAD_SHIFT - SCHED_LOAD_RESOLUTION) != SCHED_CAPACITY_SHIFT
> #error "something usefull"
> #endif
>
> somewhere near here.

Something like the below..

Another thing to ponder; the downside of scaled_delta_w is that it's
fairly likely delta is small and you lose all the bits, whereas the
weight is likely to be large and could lose a few bits without issue.

That is, in fixed point scaling like this, you want to start with the
biggest numbers, not the smallest, otherwise you lose too much.

The flip side is of course that now you can share a multiplication.
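To make the bit-loss concrete, a standalone userspace sketch (made-up
values, not kernel code) comparing the two multiplication orders:

	#include <stdio.h>
	#include <stdint.h>
	#include <inttypes.h>

	#define SHIFT 10

	int main(void)
	{
		uint64_t delta  = 3;    /* small: a few us of runtime */
		uint64_t scale  = 921;  /* ~90% of 1024 capacity scale */
		uint64_t weight = 1024; /* nice-0 weight, scaled down */

		/* scale the small delta first: (3 * 921) >> 10 == 2 */
		uint64_t small_first = ((delta * scale) >> SHIFT) * weight;

		/* multiply by the big weight first, scale once at the end */
		uint64_t big_first = (delta * weight * scale) >> SHIFT;

		/* prints "2048 vs 2763": the low bits of delta only
		 * survive when the big factor goes in first */
		printf("%" PRIu64 " vs %" PRIu64 "\n", small_first, big_first);
		return 0;
	}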

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -682,7 +682,7 @@ void init_entity_runnable_average(struct
sa->load_avg = scale_load_down(se->load.weight);
sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
sa->util_avg = scale_load_down(SCHED_LOAD_SCALE);
- sa->util_sum = LOAD_AVG_MAX;
+ sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
/* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
}

@@ -2515,6 +2515,10 @@ static u32 __compute_runnable_contrib(u6
return contrib + runnable_avg_yN_sum[n];
}

+#if (SCHED_LOAD_SHIFT - SCHED_LOAD_RESOLUTION) != 10 || SCHED_CAPACITY_SHIFT != 10
+#error "load tracking assumes 2^10 as unit"
+#endif
+
#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)

/*
@@ -2599,7 +2603,7 @@ __update_load_avg(u64 now, int cpu, stru
}
}
if (running)
- sa->util_sum += cap_scale(scaled_delta_w, scale_cpu);
+ sa->util_sum += scaled_delta_w * scale_cpu;

delta -= delta_w;

@@ -2623,7 +2627,7 @@ __update_load_avg(u64 now, int cpu, stru
cfs_rq->runnable_load_sum += weight * contrib;
}
if (running)
- sa->util_sum += cap_scale(contrib, scale_cpu);
+ sa->util_sum += contrib * scale_cpu;
}

/* Remainder of delta accrued against u_0` */
@@ -2634,7 +2638,7 @@ __update_load_avg(u64 now, int cpu, stru
cfs_rq->runnable_load_sum += weight * scaled_delta;
}
if (running)
- sa->util_sum += cap_scale(scaled_delta, scale_cpu);
+ sa->util_sum += scaled_delta * scale_cpu;

sa->period_contrib += delta;

@@ -2644,7 +2648,7 @@ __update_load_avg(u64 now, int cpu, stru
cfs_rq->runnable_load_avg =
div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
}
- sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX;
+ sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
}

return decayed;
@@ -2686,8 +2690,7 @@ static inline int update_cfs_rq_load_avg
if (atomic_long_read(&cfs_rq->removed_util_avg)) {
long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
sa->util_avg = max_t(long, sa->util_avg - r, 0);
- sa->util_sum = max_t(s32, sa->util_sum -
- ((r * LOAD_AVG_MAX) >> SCHED_LOAD_SHIFT), 0);
+ sa->util_sum = max_t(s32, sa->util_sum - r * LOAD_AVG_MAX, 0);
}

decayed = __update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
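As a sanity check on the units after this change: util_sum now carries
the extra factor 1024 from scale_cpu, so the plain division by
LOAD_AVG_MAX lands util_avg in [0, 1024]. A quick userspace
approximation (floating point, not the kernel's fixed-point tables):

	#include <stdio.h>

	int main(void)
	{
		/*
		 * y^32 == 0.5, so y ~= 0.97857; the kernel's
		 * LOAD_AVG_MAX (47742) is roughly the series limit
		 * 1024 / (1 - y), plus the partial-period contribution.
		 */
		double y = 0.97857206;
		double util_sum = 0.0;
		int ms;

		/*
		 * A task running 100% of the time at full capacity:
		 * with the patch, each 1ms period adds
		 * contrib * scale_cpu = 1024 * 1024 to util_sum,
		 * decayed by y per period.
		 */
		for (ms = 0; ms < 1000; ms++)
			util_sum = util_sum * y + 1024.0 * 1024.0;

		/* settles at ~1024 == SCHED_CAPACITY_SCALE; the small
		 * excess is down to the approximate constants */
		printf("util_avg ~= %.0f\n", util_sum / 47742.0);
		return 0;
	}

Note the ceiling, LOAD_AVG_MAX * 1024 ~= 48.9M, still fits comfortably
in the s32 util_sum.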