Re: [RFC PATCH v4 1/2] sched/fair: Introduce short duration task check
From: Chen Yu
Date: Fri Jan 06 2023 - 03:36:33 EST
Hi Dietmar,
thanks for reviewing the patch!
On 2023-01-05 at 12:33:16 +0100, Dietmar Eggemann wrote:
> On 16/12/2022 07:11, Chen Yu wrote:
>
> [...]
>
> > @@ -5995,6 +6005,18 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >
> > static void set_next_buddy(struct sched_entity *se);
> >
> > +static inline void dur_avg_update(struct task_struct *p, bool task_sleep)
> > +{
> > + u64 dur;
> > +
> > + if (!task_sleep)
> > + return;
> > +
> > + dur = p->se.sum_exec_runtime - p->se.prev_sum_exec_runtime_vol;
> > + p->se.prev_sum_exec_runtime_vol = p->se.sum_exec_runtime;
>
> Shouldn't se->prev_sum_exec_runtime_vol be set in enqueue_task_fair()
> and not in dequeue_task_fair()->dur_avg_update()? Otherwise `dur` will
> contain sleep time.
>
After task p is dequeued, p's sum_exec_runtime stops increasing. It only
starts to increase again once p is switched back in. So dur should not
include the sleep time, because we subtract two sum_exec_runtime values
rather than two rq->clock_task timestamps. Or did I misunderstand your
point?
My original thought was to record the average run time of every section,
where a section ends only when the task voluntarily relinquishes the CPU.
For example, suppose tasks p1 and p2 run alternately on CPU1:
 --------------------> time

 | p1 runs 1ms | p2 preempt p1 | p1 switch in, runs 0.5ms and blocks |
               ^               ^                                     ^
 |_____________|               |_____________________________________|
                                                                     ^
                                                                     |
                                                                p1 dequeued
p1's duration in this section is (1 + 0.5) ms, because p1 would have run
for 1.5 ms had p2 not preempted it. This reflects the nature of the task:
how long it wants to run at most before it blocks.
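
A minimal user-space sketch of the accounting described above, not the
kernel code itself. struct model_se and the model_*() helpers are
hypothetical stand-ins. Only the subtraction and the snapshot update in
model_dequeue() mirror the quoted dur_avg_update() hunk. Returning dur to
the caller is an assumption made so the value can be inspected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct sched_entity fields involved. */
struct model_se {
	uint64_t sum_exec_runtime;          /* total CPU time consumed, in ns */
	uint64_t prev_sum_exec_runtime_vol; /* snapshot at last voluntary sleep */
};

/* Account delta_ns of CPU time; called only while the task is on the CPU. */
static void model_run(struct model_se *se, uint64_t delta_ns)
{
	se->sum_exec_runtime += delta_ns;
}

/*
 * Mirrors the quoted lines of dur_avg_update(): compute the length of the
 * section that just ended. Returns 0 when the dequeue is not a voluntary
 * sleep, in which case the snapshot is left untouched.
 */
static uint64_t model_dequeue(struct model_se *se, bool task_sleep)
{
	uint64_t dur;

	if (!task_sleep)
		return 0;

	dur = se->sum_exec_runtime - se->prev_sum_exec_runtime_vol;
	se->prev_sum_exec_runtime_vol = se->sum_exec_runtime;
	return dur;
}

/* Replays the p1 timeline from the diagram and returns the section length. */
static uint64_t model_p1_section(void)
{
	struct model_se p1 = { 0, 0 };

	model_run(&p1, 1000000);	/* p1 runs 1ms */
	/* p2 preempts p1: nothing is accounted to p1 while it waits */
	model_run(&p1, 500000);		/* p1 switches in again, runs 0.5ms */

	return model_dequeue(&p1, true);	/* p1 blocks */
}
```

In this model, time spent preempted or asleep never reaches
sum_exec_runtime, so neither shows up in dur. The preemption by p2 does
not end the section, which is why the two run slices are summed.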
> Like we do for se->prev_sum_exec_runtime in set_next_entity() but for
> one `set_next_entity()-put_prev_entity()` run section.
>
> AFAICS, you want to measure the exec_runtime sum over all run sections
> between enqueue and dequeue.
Yes, we try to record the 'decayed' average exec_runtime over the sections.
Say task p runs for a ms, is then dequeued and blocks for b ms, and then
runs for c ms. Its average duration is 0.875 * a + 0.125 * c, which is
what update_avg() does.
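
For illustration, a user-space sketch of that decay step, modeled on the
scheduler's update_avg() helper. model_update_avg() is a hypothetical
name, and the sketch assumes the running average already equals a when
the sample c arrives.

```c
#include <stdint.h>

/*
 * Move the average one eighth of the way towards the new sample:
 *   avg = avg + (sample - avg) / 8 = 0.875 * avg + 0.125 * sample
 * The difference is taken as a signed value so that a sample below the
 * current average pulls the average down.
 */
static void model_update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}
```

With a = 8 ms and c = 16 ms (in ns), the average moves from 8000000 to
9000000, i.e. 0.875 * 8 + 0.125 * 16 = 9 ms. The blocked time b does not
appear anywhere in the calculation.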
thanks,
Chenyu