Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig
From: Morten Rasmussen
Date: Tue Sep 08 2015 - 12:49:46 EST
On Tue, Sep 08, 2015 at 03:31:58PM +0100, Morten Rasmussen wrote:
> On Tue, Sep 08, 2015 at 02:52:05PM +0200, Peter Zijlstra wrote:
> >
> > Something like the below...
> >
> > Another thing to ponder; the downside of scaled_delta_w is that it's
> > fairly likely delta is small and you lose all bits, whereas the weight
> > is likely to be large and could lose a few bits without issue.
>
> That issue applies both to load and util.
>
> >
> > That is, in fixed point scaling like this, you want to start with the
> > biggest numbers, not the smallest, otherwise you lose too much.
> >
> > The flip side is of course that now you can share a multiplication.
>
> But if we apply the scaling to the weight instead of the time, we would
> only have to apply it once and not three times as it is now? So maybe we
> can end up with almost the same number of multiplications.
>
> We might be losing bits for low priority tasks running on cpus at a low
> frequency though.
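
To make the truncation concrete, here is a quick userspace sketch (made-up
numbers, not kernel code: a delta of 2, the minimum task weight of 15, and
~90% frequency capacity, i.e. 921/1024):

#include <stdio.h>

#define SCHED_CAPACITY_SHIFT    10
#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)

int main(void)
{
        unsigned long delta = 2, weight = 15, scale_freq = 921;

        /* Scaling the small delta first truncates away most of it: */
        printf("cap_scale(delta, scale_freq)  = %lu (exact: %.2f)\n",
               cap_scale(delta, scale_freq), delta * scale_freq / 1024.0);

        /* Scaling the bigger weight first loses proportionally less: */
        printf("cap_scale(weight, scale_freq) = %lu (exact: %.2f)\n",
               cap_scale(weight, scale_freq), weight * scale_freq / 1024.0);

        return 0;
}

The delta case prints 1 against an exact 1.80 (~44% lost to truncation),
the weight case prints 13 against an exact 13.49 (~4%).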
Something like the below. We should be saving one multiplication per
update: two up-front scaling multiplies (scaled_weight and scale_freq_cpu)
instead of the three cap_scale() calls in the current code. A toy
before/after comparison follows the patch.
--- 8< ---
From: Morten Rasmussen <morten.rasmussen@xxxxxxx>
Date: Tue, 8 Sep 2015 17:15:40 +0100
Subject: [PATCH] sched/fair: Scale load/util contribution rather than time
When updating load/util tracking the time delta might be very small (1)
in many cases, and scaling it further down with frequency and cpu
invariance scaling might cause us to lose precision. Instead of scaling
time we can scale the weight of the task for load and the capacity for
utilization. Both weight (>= 15) and capacity should be significantly
bigger in most cases. Low priority tasks might still suffer a bit, but
the worst case should be improved, as weight is at least 15 before
invariance scaling.
Signed-off-by: Morten Rasmussen <morten.rasmussen@xxxxxxx>
---
kernel/sched/fair.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9301291..d5ee72a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2519,8 +2519,6 @@ static u32 __compute_runnable_contrib(u64 n)
#error "load tracking assumes 2^10 as unit"
#endif
-#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
-
/*
* We can represent the historical contribution to runnable average as the
* coefficients of a geometric series. To do this we sub-divide our runnable
@@ -2553,10 +2551,10 @@ static __always_inline int
__update_load_avg(u64 now, int cpu, struct sched_avg *sa,
unsigned long weight, int running, struct cfs_rq *cfs_rq)
{
- u64 delta, scaled_delta, periods;
+ u64 delta, periods;
u32 contrib;
- unsigned int delta_w, scaled_delta_w, decayed = 0;
- unsigned long scale_freq, scale_cpu;
+ unsigned int delta_w, decayed = 0;
+ unsigned long scaled_weight = 0, scale_freq, scale_freq_cpu = 0;
delta = now - sa->last_update_time;
/*
@@ -2577,8 +2575,13 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
return 0;
sa->last_update_time = now;
- scale_freq = arch_scale_freq_capacity(NULL, cpu);
- scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+ if (weight || running)
+ scale_freq = arch_scale_freq_capacity(NULL, cpu);
+ if (weight)
+ scaled_weight = weight * scale_freq >> SCHED_CAPACITY_SHIFT;
+ if (running)
+ scale_freq_cpu = scale_freq * arch_scale_cpu_capacity(NULL, cpu)
+ >> SCHED_CAPACITY_SHIFT;
/* delta_w is the amount already accumulated against our next period */
delta_w = sa->period_contrib;
@@ -2594,16 +2597,15 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
* period and accrue it.
*/
delta_w = 1024 - delta_w;
- scaled_delta_w = cap_scale(delta_w, scale_freq);
if (weight) {
- sa->load_sum += weight * scaled_delta_w;
+ sa->load_sum += scaled_weight * delta_w;
if (cfs_rq) {
cfs_rq->runnable_load_sum +=
- weight * scaled_delta_w;
+ scaled_weight * delta_w;
}
}
if (running)
- sa->util_sum += scaled_delta_w * scale_cpu;
+ sa->util_sum += delta_w * scale_freq_cpu;
delta -= delta_w;
@@ -2620,25 +2622,23 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
/* Efficiently calculate \sum (1..n_period) 1024*y^i */
contrib = __compute_runnable_contrib(periods);
- contrib = cap_scale(contrib, scale_freq);
if (weight) {
- sa->load_sum += weight * contrib;
+ sa->load_sum += scaled_weight * contrib;
if (cfs_rq)
- cfs_rq->runnable_load_sum += weight * contrib;
+ cfs_rq->runnable_load_sum += scaled_weight * contrib;
}
if (running)
- sa->util_sum += contrib * scale_cpu;
+ sa->util_sum += contrib * scale_freq_cpu;
}
/* Remainder of delta accrued against u_0` */
- scaled_delta = cap_scale(delta, scale_freq);
if (weight) {
- sa->load_sum += weight * scaled_delta;
+ sa->load_sum += scaled_weight * delta;
if (cfs_rq)
- cfs_rq->runnable_load_sum += weight * scaled_delta;
+ cfs_rq->runnable_load_sum += scaled_weight * delta;
}
if (running)
- sa->util_sum += scaled_delta * scale_cpu;
+ sa->util_sum += delta * scale_freq_cpu;
sa->period_contrib += delta;
--
1.9.1
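
For illustration, a toy userspace comparison of the before/after ordering,
with the same made-up numbers as the sketch above (delta = 2, weight = 15,
scale_freq = 921, scale_cpu = 800; not part of the patch):

#include <stdio.h>

#define SCHED_CAPACITY_SHIFT    10
#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)

int main(void)
{
        unsigned long delta = 2, weight = 15;
        unsigned long scale_freq = 921, scale_cpu = 800;

        /* Before: the small delta is scaled (and truncated) first. */
        unsigned long old_load = weight * cap_scale(delta, scale_freq);
        unsigned long old_util = cap_scale(delta, scale_freq) * scale_cpu;

        /* After: the bigger factors are scaled once up front. */
        unsigned long scaled_weight = cap_scale(weight, scale_freq);
        unsigned long scale_freq_cpu = cap_scale(scale_freq, scale_cpu);
        unsigned long new_load = scaled_weight * delta;
        unsigned long new_util = delta * scale_freq_cpu;

        printf("load: old=%lu new=%lu exact=%.2f\n", old_load, new_load,
               weight * delta * scale_freq / 1024.0);
        printf("util: old=%lu new=%lu exact=%.2f\n", old_util, new_util,
               delta * scale_freq * scale_cpu / 1024.0);

        return 0;
}

This prints "load: old=15 new=26 exact=26.98" and "util: old=800 new=1438
exact=1439.06", i.e. the new ordering stays within a rounding error of the
exact value while the old one loses ~44% of the contribution for such
small deltas.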