RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
From: Doug Smythies
Date: Wed Jan 08 2025 - 10:51:19 EST
On 2025.01.08 05:12 Peter Zijlstra wrote:
> On Tue, Jan 07, 2025 at 09:15:59PM -0800, Doug Smythies wrote:
>> On 2025.01.07 11:24 Peter Zijlstra wrote:
>
>>> What exact cgroup config are you having? /sys/kernel/debug/sched/debug
>>> should be able to tell you.
>>
>> I do not know.
>> I'll capture the above output, compress it, and send it to you.
>>
>> I did also boot with systemd.unified_cgroup_hierarchy=0
>> and it made no difference.
>
> I think you need: "cgroup_disable=cpu noautogroup" to fully disable all
> the cpu-cgroup muck. Anyway:
>
> $ zcat cgroup2.txt.gz | grep -e yes -e turbo | awk '{print $2 "\t" $16}'
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> turbostat /autogroup-286
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> yes /user.slice/user-1000.slice/session-1.scope
> turbostat /autogroup-286
>
> That matches the scenario where I could reproduce, two competing groups.
>
> I'm seeing wild vruntime divergence when this happens -- this is
> definitely wonky. Basically the turbostat group gets starved for a
> while as the yes group catches up.
>
> It looks like reweight_entity() is shooting out the cgroup entity to the
> right.
>
> So it builds up some negative lag (received surplus service) and then
> because turbostat goes to sleep for a second, its cgroup's share gets
> truncated to 2 and it shoots the cgroup entity out waaaaaaaay far.
>
> Thing is, waking up *should* fix that up again, but that doesn't appear
> to happen, leaving us up a creek.
>
> /me noodles a bit....
>
> Does this help?
Sorry, but no, it did not help.
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c0e58e51801f..daa62cfa3092 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7000,6 +7063,13 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>
>  	if (flags & ENQUEUE_DELAYED) {
>  		requeue_delayed_entity(se);
> +		se = se->parent;
> +		for_each_sched_entity(se) {
> +			cfs_rq = cfs_rq_of(se);
> +			update_load_avg(cfs_rq, se, UPDATE_TG);
> +			se_update_runnable(se);
> +			update_cfs_group(se);
> +		}
>  		return;
>  	}