Re: [RFC 15/16] sched/fair: Account kthread runtime debt for CFS bandwidth

From: Daniel Jordan
Date: Tue Jan 18 2022 - 12:40:58 EST


On Fri, Jan 14, 2022 at 10:40:06AM +0100, Peter Zijlstra wrote:
> On Fri, Jan 14, 2022 at 10:31:55AM +0100, Peter Zijlstra wrote:
>
> > Also, by virtue of this being a start-stop annotation interface, the
> > accrued time might be arbitrarily large and arbitrarily delayed. I'm not
> > sure that's sensible.
> >
> > For tasks it might be better to mark the task and have the tick DTRT
> > instead of later trying to 'migrate' the time.
>
> Which is then very close to simply sticking the task into the right
> cgroup for a limited duration.
>
> You could do a special case sched_move_task(), that takes a css argument
> instead of using the current task_css. Then for cgroups it looks like
> nothing changes, but the scheduler will DTRT and act like it is in the
> target cgroup. Then at the end, simply move it back to task_css.

Yes, that's one of the things I tried. Less new code in the scheduler
this way.

> This obviously doesn't work for SoftIRQ accounting, but that is
> 'special' anyway. Softirq stuff is not otherwise under scheduler
> control and has preemption disabled.

It also doesn't work for memory reclaim since that shouldn't be
throttled in real time. Reclaim and softirq seem to demand something
that doesn't move tasks onto runqueues, that's external to avoid getting
pushed off cpu.