Re: [PATCH 01/15] sched/fair: Add avg_vruntime

From: Peter Zijlstra
Date: Wed Oct 11 2023 - 03:30:51 EST


On Wed, Oct 11, 2023 at 12:15:28PM +0800, Abel Wu wrote:
> On 5/31/23 7:58 PM, Peter Zijlstra wrote:
> > +/*
> > + * Compute virtual time from the per-task service numbers:
> > + *
> > + * Fair schedulers conserve lag:
> > + *
> > + * \Sum lag_i = 0
> > + *
> > + * Where lag_i is given by:
> > + *
> > + * lag_i = S - s_i = w_i * (V - v_i)
>
> Since the ideal service time S is task-specific, should this be:
>
> lag_i = S_i - s_i = w_i * (V - v_i)

It is not; S is the same for all tasks. Remember, the base form is a
differential equation and all tasks progress at the same time at dt/w_i
while S progresses at dt/W.

Infinitesimals are awesome, just not feasible in a discrete system like
a time-sharing computer.
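
To make the discrete case concrete, here is a minimal user-space sketch
(not kernel code). It assumes V is taken as the weighted average of the
v_i, as derived in the quoted comment further down. The names toy_entity,
toy_avg_vruntime(), toy_total_lag() and toy_run() are made up for
illustration, and dt is assumed to be a multiple of the weight so the
integer division is exact:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduling entity; not the kernel's struct. */
struct toy_entity {
	int64_t vruntime;	/* v_i */
	int64_t weight;		/* w_i */
};

/* V = \Sum v_i * w_i / \Sum w_i, using truncating integer division. */
static int64_t toy_avg_vruntime(const struct toy_entity *e, size_t n)
{
	int64_t sum = 0, load = 0;

	for (size_t i = 0; i < n; i++) {
		sum += e[i].vruntime * e[i].weight;
		load += e[i].weight;
	}
	return load ? sum / load : 0;
}

/* \Sum lag_i = \Sum w_i * (V - v_i) */
static int64_t toy_total_lag(const struct toy_entity *e, size_t n, int64_t V)
{
	int64_t lag = 0;

	for (size_t i = 0; i < n; i++)
		lag += e[i].weight * (V - e[i].vruntime);
	return lag;
}

/* Run entity i alone for dt: only its v_i advances, by dt / w_i. */
static void toy_run(struct toy_entity *e, size_t i, int64_t dt)
{
	e[i].vruntime += dt / e[i].weight;
}
```

With three entities (v, w) = (100, 1), (200, 2), (400, 1), W is 4 and V
comes out as 225. Running the middle one for dt = 8 moves its v_i by
8/2 = 4 and V by 8/4 = 2, and the summed lag is zero before and after.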

> > + *
> > + * Where S is the ideal service time and V is its virtual time counterpart.
> > + * Therefore:
> > + *
> > + * \Sum lag_i = 0
> > + * \Sum w_i * (V - v_i) = 0
> > + * \Sum w_i * V - w_i * v_i = 0
> > + *
> > + * From which we can solve an expression for V in v_i (which we have in
> > + * se->vruntime):
> > + *
> > + *     \Sum v_i * w_i   \Sum v_i * w_i
> > + * V = -------------- = --------------
> > + *        \Sum w_i            W
> > + *
> > + * Specifically, this is the weighted average of all entity virtual runtimes.
> > + *
> > + * [[ NOTE: this is only equal to the ideal scheduler under the condition
> > + * that join/leave operations happen at lag_i = 0, otherwise the
> > + * virtual time has non-contiguous motion equivalent to:
> > + *
> > + * V +-= lag_i / W
> > + *
> > + * Also see the comment in place_entity() that deals with this. ]]
> > + *
> > + * However, since v_i is u64, and the multiplication could easily overflow,
> > + * transform it into a relative form that uses smaller quantities:
> > + *
> > + * Substitute: v_i == (v_i - v0) + v0
> > + *
> > + *     \Sum ((v_i - v0) + v0) * w_i   \Sum (v_i - v0) * w_i
> > + * V = ---------------------------- = --------------------- + v0
> > + *                  W                            W
> > + *
> > + * Which we track using:
> > + *
> > + * v0 := cfs_rq->min_vruntime
> > + * \Sum (v_i - v0) * w_i := cfs_rq->avg_vruntime
>
> IMHO 'sum_runtime' would be more appropriate? Since it actually is
> the summed real time rather than virtual time. And also 'sum_load'
> instead of 'avg_load'.

Given we subtract v0 (min_vruntime) and play games with fixed point
math, I don't think it makes sense to change this name. The purpose is
to compute the weighted average of things; let's keep the current name.
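
For illustration, a minimal user-space sketch of the relative form quoted
above. This is not the kernel implementation: toy_entity and
toy_avg_vruntime_rel() are hypothetical names, v0 is passed in by the
caller in place of cfs_rq->min_vruntime, and plain truncating division
stands in for whatever rounding the real code does:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduling entity; not the kernel's struct. */
struct toy_entity {
	uint64_t vruntime;	/* v_i, may be close to the u64 range */
	uint64_t weight;	/* w_i */
};

/*
 * V = v0 + \Sum (v_i - v0) * w_i / W
 *
 * Only the small signed offsets (v_i - v0) are multiplied by the weight,
 * so the products stay small even when the v_i themselves are huge.
 */
static uint64_t toy_avg_vruntime_rel(const struct toy_entity *e, size_t n,
				     uint64_t v0)
{
	int64_t sum = 0;	/* \Sum (v_i - v0) * w_i */
	int64_t load = 0;	/* W */

	for (size_t i = 0; i < n; i++) {
		/* Unsigned subtraction, then reinterpret as a signed offset. */
		int64_t key = (int64_t)(e[i].vruntime - v0);

		sum += key * (int64_t)e[i].weight;
		load += (int64_t)e[i].weight;
	}
	if (!load)
		return v0;

	/* A negative quotient wraps correctly in unsigned addition. */
	return v0 + (uint64_t)(sum / load);
}
```

With vruntimes near 2^62 and weights around 1024, v_i * w_i no longer fits
in 64 bits, while the offsets from v0 are a few hundred at most. In this
sketch the result does not depend on which v0 is picked, as long as the
division is exact; v0 only has to be close enough to the v_i to keep the
products small.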