Re: [RFC PATCH 0/5] introduce sched-idle balancing
From: Abel Wu
Date: Fri Feb 25 2022 - 05:46:39 EST
On 2/25/22 4:29 PM, Vincent Guittot Wrote:
On Fri, 25 Feb 2022 at 07:46, Abel Wu <wuyun.abel@xxxxxxxxxxxxx> wrote:
Hi Peter,
On 2/24/22 11:20 PM, Peter Zijlstra Wrote:
On Thu, Feb 17, 2022 at 11:43:56PM +0800, Abel Wu wrote:
Current load balancing is mainly based on cpu capacity
and task util, which makes sense from the POV of overall
throughput. However, some improvement might still be
possible by reducing the number of overloaded cfs rqs if
sched-idle or idle rqs exist.
I'm rather confused; there is already an explicit new-idle balancer and
a periodic idle balancer.
The two balancers are triggered on the rqs that have no tasks on them,
and load_balance() doesn't seem to show a preference for non-idle tasks, so
The load balance will happen at the idle pace if a sched_idle task is
running on the cpu, so you will have an ILB on each cpu that runs a
sched-idle task
I'm afraid I don't quite follow you, since the sched-idle balancer
doesn't touch the ILB part. Can you elaborate on this? Thanks.
there is a possibility that only idle tasks are pulled during load
balance while overloaded rqs (rq->cfs.h_nr_running > 1) exist. As a
There is an LB_MIN feature (disabled by default) that filters out tasks
with very low load (< 16), which includes sched-idle tasks, which have a
max load of 3
This feature might not be that friendly to the situation where only
sched-idle tasks are running in the system. And this situation
can last more than half a day in our co-location systems, in which
the training/batch tasks are placed under idle groups or directly
assigned to SCHED_IDLE.
result, the normal tasks, mostly latency-critical ones in our case, on
that overloaded rq still suffer from waiting for each other. I observed
this through perf sched.
IOW the main difference from the POV of load_balance() between the
latency-critical tasks and the idle ones is load.
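To make the load point concrete, here is a minimal user-space sketch of
an LB_MIN-style filter. It is not the kernel code: struct toy_task,
lb_min_skips() and count_migratable() are hypothetical names made up for
illustration, and only the threshold of 16 and the sched-idle load of 3
come from the discussion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Threshold quoted in the thread: tasks with load below 16 are filtered. */
#define TOY_LB_MIN_THRESHOLD 16UL

/* Hypothetical stand-in for a task; not a kernel structure. */
struct toy_task {
    unsigned long load;   /* e.g. at most 3 for a sched-idle task */
    bool sched_idle;      /* informational only, the filter ignores it */
};

/* The filter looks only at load, never at the scheduling policy. */
static bool lb_min_skips(const struct toy_task *t, bool lb_min_enabled)
{
    return lb_min_enabled && t->load < TOY_LB_MIN_THRESHOLD;
}

/* Count the tasks a balancer would still consider for migration. */
static size_t count_migratable(const struct toy_task *tasks, size_t n,
                               bool lb_min_enabled)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (!lb_min_skips(&tasks[i], lb_min_enabled))
            count++;
    return count;
}
```

With the filter enabled, a queue holding only load-3 tasks yields no
migration candidates at all, which is the situation that worries me for
systems running nothing but sched-idle tasks for long periods.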
The sched-idle balancer is triggered on the sched-idle rqs periodically
and the newly-idle ones. It does a 'fast' pull of non-idle tasks from
the overloaded rqs to the sched-idle/idle ones to let the non-idle tasks
make full use of cpu resources.
The sched-idle balancer only focuses on non-idle tasks' performance, so
it can introduce overall load imbalance, and that's why I put it before
load_balance().
Given the very low weight of a sched-idle task, I don't expect
much imbalance because of sched-idle tasks. But this also depends on
the number of sched-idle tasks.
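For readers following the thread, the 'fast' pull described above can be
sketched in user space roughly as follows. This is not the code from the
patch series: struct toy_rq and all toy_* helpers are hypothetical, and
the sketch assumes "overloaded" means more than one non-idle task on the
source queue, which is narrower than the h_nr_running > 1 test mentioned
earlier.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TASKS 8

/* Hypothetical run queue: one flag per task, true for a SCHED_IDLE task. */
struct toy_rq {
    bool task_is_idle[TOY_MAX_TASKS];
    size_t nr_tasks;
};

static size_t toy_nr_non_idle(const struct toy_rq *rq)
{
    size_t i, n = 0;

    for (i = 0; i < rq->nr_tasks; i++)
        if (!rq->task_is_idle[i])
            n++;
    return n;
}

/* Assumption: a source queue qualifies only with 2+ non-idle tasks. */
static bool toy_overloaded(const struct toy_rq *rq)
{
    return toy_nr_non_idle(rq) > 1;
}

/* A destination qualifies if it is empty or runs only sched-idle tasks. */
static bool toy_sched_idle_or_idle(const struct toy_rq *rq)
{
    return toy_nr_non_idle(rq) == 0;
}

/* Move the first non-idle task from src to dst; return true on success. */
static bool toy_pull_one(struct toy_rq *src, struct toy_rq *dst)
{
    size_t i, j;

    if (!toy_overloaded(src) || !toy_sched_idle_or_idle(dst) ||
        dst->nr_tasks >= TOY_MAX_TASKS)
        return false;

    for (i = 0; i < src->nr_tasks; i++)
        if (!src->task_is_idle[i])
            break;

    /* Close the gap left by the pulled task. */
    for (j = i; j + 1 < src->nr_tasks; j++)
        src->task_is_idle[j] = src->task_is_idle[j + 1];
    src->nr_tasks--;

    dst->task_is_idle[dst->nr_tasks++] = false;
    return true;
}
```

Note that the pull never looks at load at all, only at whether tasks are
idle or not, so any load imbalance it leaves behind is left to the
regular load_balance() pass that runs afterwards.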
Best Regards,
Abel