Re: [RFC] sched/fair: Align vruntime to last_se when curr_se's timeslice run out

From: Peter Zijlstra
Date: Thu Dec 06 2018 - 04:36:48 EST


On Wed, Dec 05, 2018 at 08:41:39PM +0800, weiqi (C) wrote:

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ee271bb..1f61b9c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4020,7 +4020,23 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
> ideal_runtime = sched_slice(cfs_rq, curr);
> delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
> if (delta_exec > ideal_runtime) {
> + struct rb_node *next = NULL;
> + struct rb_node *right_most = NULL;
> + struct sched_entity *last;
> resched_curr(rq_of(cfs_rq));
> +
> + /* always set to max vruntime */
> + if (cfs_rq->nr_running > 1) {
> + next = &curr->run_node;
> + do {
> + right_most = next;
> + next = rb_next(next);
> + } while (next);
> +
> + last = rb_entry(right_most,
> + struct sched_entity, run_node);

This you can obviously do better by tracking max_vruntime alongside
min_vruntime. But for testing this should work fine I suppose.

> + curr->vruntime = last->vruntime + 1; // maybe +1 is not needed

This however is completely broken... you've basically reduced a virtual
runtime scheduler to a simple RR one.
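
To make the objection concrete, here is a minimal user-space sketch, not
kernel code. The names toy_se, toy_account(), toy_align_to_last() and
toy_pick_next() are hypothetical, and the model assumes a flat array in
place of the rbtree and ignores vruntime wraparound. It contrasts normal
weighted accounting with the "jump past the last entity" rule from the
quoted patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy entity; NOT the kernel's struct sched_entity. */
struct toy_se {
	uint64_t vruntime;	/* weighted virtual runtime, in ns */
	uint64_t weight;	/* load weight; 1024 models nice 0 */
};

#define TOY_NICE_0_LOAD 1024ULL

/* Normal accounting: vruntime advances by the weight-scaled runtime. */
static void toy_account(struct toy_se *se, uint64_t delta_exec)
{
	se->vruntime += delta_exec * TOY_NICE_0_LOAD / se->weight;
}

/*
 * The rule from the quoted patch: once the slice is used up, place the
 * current entity just past the largest vruntime on the queue.
 * Simplification: the maximum is taken over the whole array.
 */
static void toy_align_to_last(struct toy_se *q, size_t n, size_t curr)
{
	uint64_t max = q[0].vruntime;
	size_t i;

	for (i = 1; i < n; i++)
		if (q[i].vruntime > max)
			max = q[i].vruntime;

	q[curr].vruntime = max + 1;
}

/* Pick the entity with the smallest vruntime (the "leftmost" one). */
static size_t toy_pick_next(const struct toy_se *q, size_t n)
{
	size_t best = 0;
	size_t i;

	for (i = 1; i < n; i++)
		if (q[i].vruntime < q[best].vruntime)
			best = i;

	return best;
}
```

With vruntimes {100, 5000, 200} and entity 0 running for 50ns at nice 0,
toy_account() leaves it at 150 and it is still the leftmost entity, since
it is still owed service relative to the others. toy_align_to_last()
instead moves it to 5001, behind everyone, so whatever service it was owed
is discarded and the pick order degenerates into a rotation.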

Yes, place_entity() is not ideal; for starters, we should not insert
relative to min_vruntime but to the 0-lag point (weighted average
vruntime). And IIRC, we should not let negative-lag tasks reduce the
runqueue weight.

But those things are computationally expensive to do, so we fudged it.
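
For reference, the 0-lag point mentioned above is the load-weighted mean
of the queued vruntimes, V = sum(w_i * v_i) / sum(w_i). A rough user-space
sketch follows; lag_se, lag_avg_vruntime() and lag_of() are hypothetical
names, and the sums are assumed to fit in 64 bits. It computes V relative
to min_vruntime to keep the products small.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy entity; NOT the kernel's struct sched_entity. */
struct lag_se {
	uint64_t vruntime;	/* weighted virtual runtime, in ns */
	uint64_t weight;	/* load weight; 1024 models nice 0 */
};

/*
 * Weighted average vruntime (the 0-lag point), computed as an offset
 * from min_vruntime so that the per-entity products stay small.
 * Returns min_vruntime for an empty queue.
 */
static uint64_t lag_avg_vruntime(const struct lag_se *q, size_t n,
				 uint64_t min_vruntime)
{
	int64_t sum = 0;	/* sum of w_i * (v_i - min_vruntime) */
	uint64_t load = 0;	/* sum of w_i */
	size_t i;

	for (i = 0; i < n; i++) {
		int64_t key = (int64_t)(q[i].vruntime - min_vruntime);

		sum += key * (int64_t)q[i].weight;
		load += q[i].weight;
	}

	if (!load)
		return min_vruntime;

	return min_vruntime + (uint64_t)(sum / (int64_t)load);
}

/* Lag of one entity: positive means it is owed service. */
static int64_t lag_of(const struct lag_se *se, uint64_t avg)
{
	return (int64_t)(avg - se->vruntime);
}
```

For two entities at vruntime 1000 (weight 1024) and 4000 (weight 2048),
this gives V = 3000, with lags of +2000 and -1000; the weight-scaled lags
cancel out, which is what makes V the 0-lag point. This naive version
walks every queued entity and multiplies per entity, so it costs more
than inserting relative to min_vruntime.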

> + }
> +
> /*
> * The current task ran long enough, ensure it doesn't get
> * re-elected due to buddy favours.