Re: [PATCH v4 6/7] sched/deadline: Deferrable dl server

From: Peter Zijlstra
Date: Tue Sep 05 2023 - 12:35:51 EST


On Thu, Aug 31, 2023 at 10:28:57PM +0200, Daniel Bristot de Oliveira wrote:
> +void dl_server_start(struct sched_dl_entity *dl_se, int defer)
> {
> + if (dl_se->server_state != DL_SERVER_STOPPED) {
> + WARN_ON_ONCE(!(on_dl_rq(dl_se) || dl_se->dl_throttled));
> + return;
> + }
> +
> + if (defer) {
> + /*
> + * Postpone the replenishment to (now + period - runtime)
> + *
> + * With this in place, we have two cases:
> + *
> + * In the absence of DL tasks:
> + * The server will start at the replenishment time, getting
> + * its runtime before now + period. This is the expected
> + * throttling behavior.
> + *
> + * In the presence of DL tasks:
> + * The server will be replenished, and then it will be
> + * scheduled according to EDF, not breaking SCHED_DEADLINE.
> + *
> + * In the first cycle the server will be postponed by at most
> + * period + period - runtime. But then the
> + * server will receive its runtime/period.
> + *
> + * The server will, however, run on top of any RT task, which
> + * is the expected throttling behavior.
> + */
> + dl_se->deadline = rq_clock(dl_se->rq) + dl_se->dl_period - dl_se->dl_runtime;

I was confused by this, but this is an instance of
replenish_dl_new_period(), where we explicitly do not preserve the
period.
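
(For reference, that helper is roughly the below -- quoting from memory, so
the pi_of()/field details may be off:)

	static inline void
	replenish_dl_new_period(struct sched_dl_entity *dl_se, struct rq *rq)
	{
		/* Start a fresh period from now; the old deadline is not preserved. */
		dl_se->deadline = rq_clock(rq) + pi_of(dl_se)->dl_deadline;
		dl_se->runtime = pi_of(dl_se)->dl_runtime;
	}

The defer case above just pulls that fresh deadline in by the runtime and
leaves the entity throttled until the timer fires.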

> + /* Zero the runtime */
> + dl_se->runtime = 0;
> + /* throttle the server */
> + dl_se->dl_throttled = 1;

These comments are both obvious and placed so as to make everything
unreadable :/
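
Something like the below would read better -- same behaviour, just one
comment covering the pair:

	/* Zero the runtime and throttle until the replenishment timer fires. */
	dl_se->runtime = 0;
	dl_se->dl_throttled = 1;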

> +
> + dl_se->server_state = DL_SERVER_DEFER;
> + start_dl_timer(dl_se);
> + return;
> + }
> +
> if (!dl_server(dl_se)) {
> dl_se->dl_server = 1;
> setup_new_dl_entity(dl_se);
> }
> +
> + dl_se->server_state = DL_SERVER_RUNNING;
> enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
> }


> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 580e6764a68b..b9d0f08dc8ca 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6499,9 +6499,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> */
> util_est_enqueue(&rq->cfs, p);
>
> - if (!rq->cfs.h_nr_running)
> - dl_server_start(&rq->fair_server);
> -
> /*
> * If in_iowait is set, the code below may not trigger any cpufreq
> * utilization updates, so do it here explicitly with the IOWAIT flag
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index e23cc67c9467..7595110a5a3e 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1537,6 +1537,9 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
>
> if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
> enqueue_pushable_task(rq, p);
> +
> + if (sched_fair_server_needed(rq))
> + dl_server_start(&rq->fair_server, rq->fair_server_defer);
> }
>
> static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int flags)
> @@ -1547,6 +1550,9 @@ static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int flags)
> dequeue_rt_entity(rt_se, flags);
>
> dequeue_pushable_task(rq, p);
> +
> + if (!sched_fair_server_needed(rq))
> + dl_server_stop(&rq->fair_server);
> }
>
> /*

I'm thinking this is wrong -- ISTR there were definite benefits to explicit
slack time scheduling, meaning the server should also run while DL tasks
are on. Additionally, running the server when there are only fair tasks
is mostly harmless, right? So why this change?

One of the benefits was -- IIRC -- that we no longer need to subtract some
random margin from the reservation limit, but there were more I think.
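
IOW, I'd expect to simply keep the previous placement in
enqueue_task_fair() -- roughly the below, with the new defer argument
threaded through -- and the matching dl_server_stop() in the fair dequeue
path, instead of keying both off sched_fair_server_needed() in rt.c:

	if (!rq->cfs.h_nr_running)
		dl_server_start(&rq->fair_server, rq->fair_server_defer);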