Re: [PATCH 0/10 v2] sched/fair: Fix statistics with delayed dequeue

From: Vincent Guittot
Date: Mon Dec 02 2024 - 04:18:12 EST


On Sun, 1 Dec 2024 at 14:30, Mike Galbraith <efault@xxxxxx> wrote:
>
> Greetings,
>
> On Fri, 2024-11-29 at 17:17 +0100, Vincent Guittot wrote:
> > The delayed dequeue feature keeps a sleeping sched_entity enqueued until its
> > lag has elapsed. As a result, it also stays visible in the statistics that
> > are used to balance the system, in particular the field h_nr_running.
> >
> > This series fixes those metrics by creating a new h_nr_queued that tracks
> > all queued tasks. It renames h_nr_running to h_nr_runnable and restores
> > the behavior of h_nr_running, i.e. tracking the number of fair tasks that
> > want to run.
> >
> > h_nr_runnable is used in several places to make load-balancing decisions:
> > - PELT runnable_avg
> > - deciding if a group is overloaded or has spare capacity
> > - numa stats
> > - reduced capacity management
> > - load balance between groups
>
> I took the series for a spin in tip v6.12-10334-gb1b238fba309, but
> runnable seems to have an off-by-one issue, causing it to wander ever
> further south.
>
> patches 1-3 applied.
> .h_nr_runnable : -3046
> .runnable_avg : 450189777126

Yeah, I messed something up around finish_delayed_dequeue_entity().
I'm going to prepare a v3.
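
For illustration only, here is a toy user-space model of how a counter like
h_nr_runnable can drift steadily negative when the path that finishes a
delayed dequeue decrements it a second time. This is not the kernel code and
not a diagnosis of the actual bug: the struct, the toy_* helpers and the
double decrement are all hypothetical, chosen only to show the general class
of accounting error that would make the value go down by one per sleep cycle.

```c
/* Toy user-space model; NOT kernel code. All names are hypothetical. */
#include <assert.h>

struct toy_rq {
	int h_nr_queued;   /* tasks still on the queue, delayed ones included */
	int h_nr_runnable; /* queued tasks that actually want to run */
};

static void toy_enqueue(struct toy_rq *rq)
{
	rq->h_nr_queued++;
	rq->h_nr_runnable++;
}

/* Task goes to sleep but stays queued until its lag has elapsed. */
static void toy_delay_dequeue(struct toy_rq *rq)
{
	rq->h_nr_runnable--;
}

/* Balanced completion: only the queued count drops here. */
static void toy_finish_delayed(struct toy_rq *rq)
{
	rq->h_nr_queued--;
}

/* Unbalanced completion (assumed bug): runnable is decremented again. */
static void toy_finish_delayed_buggy(struct toy_rq *rq)
{
	rq->h_nr_queued--;
	rq->h_nr_runnable--;
}

/* Run n enqueue/sleep/finish cycles and return the final h_nr_runnable. */
static int toy_run_cycles(int n, int buggy)
{
	struct toy_rq rq = { 0, 0 };

	for (int i = 0; i < n; i++) {
		toy_enqueue(&rq);
		toy_delay_dequeue(&rq);
		if (buggy)
			toy_finish_delayed_buggy(&rq);
		else
			toy_finish_delayed(&rq);
	}
	/* The queued count is balanced in both variants. */
	assert(rq.h_nr_queued == 0);
	return rq.h_nr_runnable;
}
```

In this model toy_run_cycles(n, 0) returns 0, while toy_run_cycles(n, 1)
returns -n, i.e. the counter loses one per cycle and never recovers.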

>
> full set applied.
> .h_nr_runnable : -5707
> .runnable_avg : 4391793519526
>
> -Mike