Re: [PATCH] sched: Forward deadline for early tick

From: zihan zhou
Date: Tue Dec 24 2024 - 21:48:59 EST


From: zhouzihan30 <zhouzihan30@xxxxxx>

Thank you for your reply!

> Having delta of rq_clock toggling above or below 1ms is normal because
> of the clockevent precision, if the previous delta was longer than 1ms
> then the next one will be shorter. But the average of several ticks
> remains 1ms like in your trace above
>
> > than 1ms
> >
> > In order to conduct a comparative experiment, I turned off those CONFIG
> > and re checked the changes in clock, It is found that the values of
> > rq clock and rq clock task become completely consistent, However,
> > according to the information from perf, there are still errors in tick
> > (slice=3ms) :
>
> Did you check that the whole tick was accounted for the task ?
> According to your trace of rq clock delta and rq clock task delta
> above, most of the sum of 3 consecutives tick is greater than 3ms for
> rq clock delta so I would assume that the sum of delta_exec would be
> greater than 3ms as well after 3 ticks
>
> >
> > time cpu task name wait time sch delay run time
> > [tid/pid] (msec) (msec) (msec)
> > ---------- ------ ------------ --------- --------- ---------
> > 110.436513 [0001] perf[1414] 0.000 0.000 0.000
> > 110.440490 [0001] bash[1341] 0.000 0.000 3.977
> > 110.441490 [0001] bash[1344] 0.000 0.000 0.999
> > 110.441548 [0001] perf[1414] 4.976 0.000 0.058
> > 110.445491 [0001] bash[1344] 0.058 0.000 3.942
> > 110.449490 [0001] bash[1341] 5.000 0.000 3.999
> > 110.452490 [0001] bash[1344] 3.999 0.000 2.999
> > 110.456491 [0001] bash[1341] 2.999 0.000 4.000
> > 110.460489 [0001] bash[1344] 4.000 0.000 3.998
> > 110.463490 [0001] bash[1341] 3.998 0.000 3.001
> > 110.467493 [0001] bash[1344] 3.001 0.000 4.002
> > 110.471490 [0001] bash[1341] 4.002 0.000 3.996
> > 110.474489 [0001] bash[1344] 3.996 0.000 2.999
> > 110.477490 [0001] bash[1341] 2.999 0.000 3.000
> >

I used perf to record the impact of tick errors on run time. With
slice=3ms, two busy tasks compete for one CPU. If a task runs for 4ms, it
means that the three ticks together accounted for slightly less than 3ms,
so the slice had not yet expired at the third tick. The task can only be
switched out at the next tick, after running for 4ms, which is 1ms more
than its slice. Based on my observation, this happens about 50% of the
time (without CONFIG_IRQ_TIME_ACCOUNTING; with it enabled, the task runs
for 4ms even more often, even though slice=3ms).
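
To illustrate the arithmetic, here is a minimal sketch of the effect. It
is a simplified model and not the kernel's actual code path: it assumes
that the slice is only checked at each tick and that the accumulated run
time is compared against the slice. The function name and the tick deltas
are hypothetical values chosen for illustration.

```python
SLICE_NS = 3_000_000  # 3 ms slice, matching the experiment described above


def ticks_until_preempt(tick_deltas_ns, slice_ns=SLICE_NS):
    """Return (tick_count, runtime_ns) at the first tick where the
    accumulated run time reaches the slice.

    Simplified model: the slice is only checked when a tick fires.
    """
    runtime = 0
    for n, delta in enumerate(tick_deltas_ns, start=1):
        runtime += delta
        if runtime >= slice_ns:
            return n, runtime
    raise ValueError("slice not exhausted by the given ticks")


# Ticks that are exactly 1 ms: the task is switched out after 3 ticks.
exact = [1_000_000, 1_000_000, 1_000_000, 1_000_000]

# First tick 1 us early: after 3 ticks only 2.999 ms is accounted, so the
# task keeps running until the 4th tick.
early = [999_000, 1_000_000, 1_000_000, 1_001_000]

# First tick 1 us late: 3.001 ms is accounted after 3 ticks.
late = [1_001_000, 1_000_000, 1_000_000, 999_000]

for name, deltas in (("exact", exact), ("early", early), ("late", late)):
    ticks, runtime = ticks_until_preempt(deltas)
    print(f"{name}: preempted after {ticks} ticks, {runtime / 1e6:.3f} ms")
```

In this model a 1us shortfall at the third tick costs the other task a
full extra tick of waiting, which matches the alternating 3ms/4ms run
times in the perf output.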


> > We once considered subtracting a little from a slice when setting it,
> > for example, if someone sets 3ms, we can subtract 0.1ms from it and
> > make it 2.9ms. But this is not a good solution. If someone sets it to
> > 3.1ms, should we use 2.9ms or 3ms? There doesn't seem to be a
> > particularly good option, and it may lead to even greater system errors.
>
> And we end up giving less than its slice to task which could have set
> it to this value for a good reason.

Thank you. I think we agree that the time given to a task in one run
should be less than or equal to its slice. EEVDF never guarantees that a
task must run for a full slice at once, but the kernel ensures this.
However, because of tick errors, a task can exceed its allotted time.
I will send a patch v2 to try to solve this problem.