Re: [PATCH v4 6/7] sched/deadline: Deferrable dl server

From: Daniel Bristot de Oliveira
Date: Wed Sep 06 2023 - 10:59:09 EST

Next message: Andy Shevchenko: "Re: [PATCH v4 1/2] hwmon: tmp513: Add max_channels variable to struct tmp51x_data"
Previous message: Jakub Kicinski: "Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep"
In reply to: Peter Zijlstra: "Re: [PATCH v4 6/7] sched/deadline: Deferrable dl server"
Next in thread: Peter Zijlstra: "Re: [PATCH v4 6/7] sched/deadline: Deferrable dl server"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 9/6/23 10:29, Peter Zijlstra wrote:
> On Tue, Sep 05, 2023 at 05:24:40PM +0200, Daniel Bristot de Oliveira wrote:
>> On 9/5/23 15:42, Peter Zijlstra wrote:
>>> On Thu, Aug 31, 2023 at 10:28:57PM +0200, Daniel Bristot de Oliveira wrote:
>>>> +void dl_server_start(struct sched_dl_entity *dl_se, int defer)
>>>> {
>>>> + if (dl_se->server_state != DL_SERVER_STOPPED) {
>>>> + WARN_ON_ONCE(!(on_dl_rq(dl_se) || dl_se->dl_throttled));
>>>> + return;
>>>> + }
>>>> +
>>>> + if (defer) {
>>>> + /*
>>>> + * Postpone the replenishment to the (next period - the execution time)
>
> perhaps explicitly mention zero-laxity here, then we all know what is
> meant, no?

Last time I used that word it caused more problems than helped :-) But I will
add it and specify that is "for this task".

>>>> + *
>>>> + * With this in place, we have two cases:
>>>> + *
>>>> + * On the absence of DL tasks:
>>>> + * The server will start at the replenishment time, getting
>>>> + * its runtime before now + period. This is the expected
>>>> + * throttling behavior.
>>>> + *
>>>> + * In the presense of DL tasks:
>>>> + * The server will be replenished, and then it will be
>>>> + * schedule according to EDF, not breaking SCHED_DEADLINE.
>>>> + *
>>>> + * In the first cycle the server will be postponed at most
>>>> + * at period + period - runtime at most. But then the
>>>> + * server will receive its runtime/period.
>>>> + *
>>>> + * The server will, however, run on top of any RT task, which
>>>> + * is the expected throttling behavior.
>>>> + */
>>>> + dl_se->deadline = rq_clock(dl_se->rq) + dl_se->dl_period - dl_se->dl_runtime;
>>>
>>> I was confused by this, but this is an instance of
>>> replenish_dl_new_period(), where we explicitly do not preserve the
>>> period.
>>
>> yeah, it is two operations in one (so not straight forward). It sets
>> the period as the now - runtime, so..
>
> Right. I'll just attribute it to me being generally confused about
> everything after holidays ;-)
>
> The thing that tripped me was that the use of rq_clock() breaks
> periodicy, except of course you want/need that when you start a new
> activation.

that rq_clock() is only used at the time in which we start the deferred
server. If the throttling condition stays one, the server will act as a
regular periodic DL task...

>
>>>> + /* Zero the runtime */
>>>> + dl_se->runtime = 0;
>>>> + /* throttle the server */
>>>> + dl_se->dl_throttled = 1;
>>
>> as the server is throttled, it will set the replenishing timer for now - runtime + period because
>> of the deadline.
>
> Yeah, it's a wee hack to move it to the zero-laxity point. I was
> considering if it makes sense to push that down and make it available
> for all DL tasks, but I'm not sure..

It might be useful in the future, like when DL dominates all other schedulers, so
we can have a way to schedule a deferred work, like kworkers... :-) But it might be
too early for that..

>>> I'm thinking this is wrong -- ISTR there definite benefits to explicit
>>> slack time scheduling, meaning the server should also run while DL tasks
>>> are on.
>>
>> I can add the check to enable it also in the presence of DL tasks, we just need
>> to have consider that the dl server is a dl task as well when disabling.
>
> Why not keep what was there, always run the thing when there's FAIR
> tasks around? Or do you see severe overhead from running it with only
> FAIR tasks on?

Assuming that most of the cases people only have fair tasks, which is probably
true in the regular kernel... (like, no threaded IRQs).

>> It is a benefit because it will fix the case in which a CPU full of DL tasks
>> (possible with global).
>>
>> Additionally, running the server when there are only fair tasks
>>> is mostly harmless, right? So why this change?
>>
>> Rhe problem is that we never know when a RT one will arrive :-/
>>
>> E.g., only fair tasks, server enabled. All fine :-) Then an RT arrives, and gets
>> postponed by the server... RT people complaining (for those thinking on adding
>> a server for RT, we will have a similar problem as we have with throttling today +
>> DL has no fixed priority).
>
> Well, the thing is, the moment you run DL tasks on that machine those RT
> tasks will get delayed *anyway*. RT people need to stop pretending FIFO
> is the highest class.
>
> But yeah, I can see why they might get upset if the time is then used to
> run FAIR tasks while their RT task gets to wait.

right

>> We will still face the same problem with defferable server, and the example is the same,
>> just with a shift in the RT arrival. e.g., only fair task for (server period - runtime),
>> then the server gets enabled, and a 1us after the RT arrives and gets postponed... again.
>>
>> So the need to enable it only after the dispatch of a kind of RT (or DL to be added)
>> tasks that can cause starvation.
>>
>> We could simplify it by enabling only on RT/DL arrival but... if there is nothing to
>> starve... it is not needed anyways so less overhead avoiding the enqueue...
>
> So one thing we could do is have update_curr_fair() decrement
> fair_server's runtime and yield the period then it hits 0 (and capping
> it at 0, not allowing it to go negative or so).
>
> That way you only force the situation when FAIR hasn't had it's allotted
> time this perio, and only for as much as to make up for the time it
> lacks.

We can also decrease the runtime to a negative number while in defer/throttle state, and let
the while in replenish_dl_entity() to replenish with the += runtime;

here:
--- %< ---
/*
* We keep moving the deadline away until we get some
* available runtime for the entity. This ensures correct
* handling of situations where the runtime overrun is
* arbitrary large.
*/
while (dl_se->runtime <= 0) {
dl_se->deadline += pi_of(dl_se)->dl_period;
dl_se->runtime += pi_of(dl_se)->dl_runtime;
}

--- >% ---

it is already done... no?

>
>>>
>>> One of the benefits was -- IIRC, that we no longer need to subtract some
>>> random margin from the reservation limit, but there were more I think.
>>>
>>
>> We can simplify things because we can start ignoring the rt throttling limit when
>> there is no need to throttle... like on grub (but, only after we remove the rt
>> throttling).
>>
>> Thoughts?
>
> The moment the fair server is there (these patches) you should also kill
> the throttling. It doesn't make sense to have both.
>

Next message: Andy Shevchenko: "Re: [PATCH v4 1/2] hwmon: tmp513: Add max_channels variable to struct tmp51x_data"
Previous message: Jakub Kicinski: "Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep"
In reply to: Peter Zijlstra: "Re: [PATCH v4 6/7] sched/deadline: Deferrable dl server"
Next in thread: Peter Zijlstra: "Re: [PATCH v4 6/7] sched/deadline: Deferrable dl server"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]